纳金网

Title: Data-driven Visual Similarity for Cross-domain Image Matching

Author: 晃晃    Time: 2011-12-28 09:41

Data-driven Visual Similarity for Cross-domain Image Matching
Abhinav Shrivastava (Carnegie Mellon University), Tomasz Malisiewicz (MIT), Abhinav Gupta (Carnegie Mellon University), Alexei A. Efros (Carnegie Mellon University)
Abstract
The goal of this work is to find visually similar images even if they appear quite different at the raw pixel level. This task is particularly important for matching images across visual domains, such as photos taken over different seasons or lighting conditions, paintings, hand-drawn sketches, etc. We propose a surprisingly simple method that estimates the relative importance of different features in a query image based on the notion of "data-driven uniqueness". We employ standard tools from discriminative object detection in a novel way, yielding a generic approach that does not depend on a particular image representation or a specific visual domain. Our approach shows good performance on a number of difficult cross-domain visual tasks, e.g., matching paintings or sketches to real photographs. The method also allows us to demonstrate novel applications such as Internet re-photography and painting2gps. While at present the technique is too computationally intensive to be practical for interactive image retrieval, we hope that some of the ideas will eventually become applicable to that domain as well.
CR Categories: I.2.10 [Artificial Intelligence]: Vision and Scene Understanding—Learning; I.4.10 [Image Processing and Computer Vision]: Image Representation—Statistical
Keywords: image matching, visual similarity, saliency, image retrieval, paintings, sketches, re-photography, visual memex
Introduction

Data-driven approaches have been gaining popularity in computer graphics and computational photography. Unlike traditional methods, which employ parametric models to capture visual phenomena, data-driven approaches use visual data directly, without an explicit intermediate representation. These approaches have shown promising results on a wide range of challenging computer graphics problems, including super-resolution and de-noising [Freeman et al. 2002; Buades et al. 2005; HaCohen et al. 2010], texture and video synthesis [Efros and Freeman 2001; Schodl et al. 2000], image analogies [Hertzmann et al. 2001], automatic colorization [Torralba et al. 2008], scene and video completion [Wexler et al.; Hays and Efros 2007; Whyte et al. 2009], photo restoration [Dale et al. 2009], assembling photo-realistic virtual spaces [Kaneva et al. 2010; Chen et al. 2009], and even making CG imagery more realistic [Johnson et al. 2010], to give but a few examples.
The central element common to all the above approaches is searching a large dataset to find visually similar matches to a given query – be it an image patch, a full image, or a spatio-temporal block. However, defining a good visual similarity metric to use for matching can often be surprisingly difficult. Granted, in many situations where the data is reasonably homogeneous (e.g., different patches within the same texture image [Efros and Freeman 2001], or different frames within the same video [Schodl et al. 2000]), a simple pixel-wise sum-of-squared-differences (L2) matching works quite well.
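To make that baseline concrete, here is a minimal sketch of pixel-wise SSD (L2) matching in NumPy; the patch sizes and the random data are purely illustrative and not taken from any of the cited systems.

    import numpy as np

    def ssd_match(query_patch, candidate_patches):
        """Return (best_index, distances): pixel-wise sum-of-squared-differences
        (L2) between a query patch and each same-sized candidate patch."""
        diffs = candidate_patches - query_patch[np.newaxis]   # broadcast over candidates
        dists = (diffs ** 2).reshape(len(candidate_patches), -1).sum(axis=1)
        return int(np.argmin(dists)), dists

    # Illustrative usage: a 32x32 grayscale query against 1000 candidate patches.
    rng = np.random.default_rng(0)
    query = rng.random((32, 32))
    candidates = rng.random((1000, 32, 32))
    best_idx, dists = ssd_match(query, candidates)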
But what about the cases when the visual content is only similar on the higher scene level, yet quite dissimilar on the pixel level? For instance, methods that use scene matching, e.g., [Hays and Efros 2007; Dale et al. 2009], often need to match images across different illuminations, different seasons, different cameras, etc. Likewise, retexturing an image in the style of a painting [Hertzmann et al. 2001; Efros and Freeman 2001] requires establishing visual correspondence between two very different domains – photos and paintings. Cross-domain matching is even more critical for applications such as Sketch2Photo [Chen et al. 2009] and CG2Real [Johnson et al. 2010], which aim to bring domains as different as sketches and CG renderings into correspondence with natural photographs. In all of these cases, pixel-wise matching fares quite poorly, because small perceptual differences can result in arbitrarily large pixel-wise differences. What is needed is a visual metric that can capture the important visual structures that make two images appear similar, yet show robustness to small, unimportant visual details. This is precisely what makes this problem so difficult – the visual similarity algorithm somehow needs to know which visual structures are important for a human observer and which are not.
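As a rough illustration of the "data-driven uniqueness" idea described in the abstract, the sketch below treats the query image as the single positive example of a linear SVM trained against a large pool of unrelated background images, then ranks candidates by the classifier score. The HOG descriptor, the background pool, and every parameter here are assumptions made for illustration; this is a sketch of the general idea, not the authors' implementation.

    import numpy as np
    from skimage.feature import hog
    from skimage.transform import resize
    from sklearn.svm import LinearSVC

    def image_to_feature(img, size=(128, 128)):
        """Illustrative global descriptor: HOG on a resized grayscale image."""
        if img.ndim == 3:                      # crude grayscale conversion
            img = img.mean(axis=2)
        img = resize(img, size, anti_aliasing=True)
        return hog(img, orientations=9, pixels_per_cell=(8, 8), cells_per_block=(2, 2))

    def uniqueness_weights(query_feat, background_feats, C=0.1):
        """Train a linear SVM with the query as the ONLY positive against many
        generic 'background' images. Feature dimensions that distinguish the
        query from the background get large weights; common structures are
        effectively down-weighted."""
        X = np.vstack([query_feat[np.newaxis], background_feats])
        y = np.r_[1, np.zeros(len(background_feats), dtype=int)]
        clf = LinearSVC(C=C, class_weight={0: 1.0, 1: float(len(background_feats))})
        clf.fit(X, y)
        return clf.coef_.ravel(), float(clf.intercept_[0])

    def rank_candidates(w, b, candidate_feats):
        """Score candidate images with the learned weights; higher = more similar."""
        scores = candidate_feats @ w + b
        return np.argsort(-scores), scores

Because features that occur in many background images contribute little to the learned score, matching with these weights rather than raw pixels tends to emphasize the structures that make the query distinctive, which is the property the text above argues is needed for cross-domain matching.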
Author: 菜刀吻电线    Time: 2012-3-8 23:21
Nice, a real classic and very practical.

Author: 奇    Time: 2012-3-16 23:31
Whatever!

Author: 晃晃    Time: 2012-3-21 23:26
No silver buried here...

Author: C.R.CAN    Time: 2012-7-8 23:25
Heh, very nice and convenient.

Author: 奇    Time: 2012-8-6 00:21
Nice, thanks to the OP.

Author: 奇    Time: 2012-9-19 10:00
Thank you very much.

Author: 奇    Time: 2012-12-14 23:21
I'm new here, my respects to everyone!

Author: tc    Time: 2012-12-16 23:19
Listen to whatever the mods say; bump whatever your friends post.

Author: 晃晃    Time: 2013-3-6 23:27
Really nice, saved it all.