Latest Publications: 2009 IEEE Conference on Computer Vision and Pattern Recognition

Nonparametric scene parsing: Label transfer via dense scene alignment
Pub Date: 2009-06-20 | DOI: 10.1109/CVPR.2009.5206536
Ce Liu, Jenny Yuen, A. Torralba
{"title":"Nonparametric scene parsing: Label transfer via dense scene alignment","authors":"Ce Liu, Jenny Yuen, A. Torralba","doi":"10.1109/CVPR.2009.5206536","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206536","url":null,"abstract":"In this paper we propose a novel nonparametric approach for object recognition and scene parsing using dense scene alignment. Given an input image, we retrieve its best matches from a large database with annotated images using our modified, coarse-to-fine SIFT flow algorithm that aligns the structures within two images. Based on the dense scene correspondence obtained from the SIFT flow, our system warps the existing annotations, and integrates multiple cues in a Markov random field framework to segment and recognize the query image. Promising experimental results have been achieved by our nonparametric scene parsing system on a challenging database. Compared to existing object recognition approaches that require training for each object category, our system is easy to implement, has few parameters, and embeds contextual information naturally in the retrieval/alignment procedure.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124006901","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 361
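To make the label-transfer pipeline concrete, below is a minimal Python sketch of the retrieve, warp, and vote steps. It assumes the dense correspondence (the SIFT flow itself) and the retrieval descriptors are computed elsewhere; the per-pixel vote stands in for the paper's full Markov random field, and all function names are illustrative rather than taken from the authors' code.

```python
import numpy as np

def retrieve_neighbors(query_feat, db_feats, k=3):
    # Rank database images by L2 distance in a global descriptor space
    # (the paper uses a retrieval step before SIFT flow refinement).
    d = np.linalg.norm(db_feats - query_feat[None, :], axis=1)
    return np.argsort(d)[:k]

def warp_labels(label_map, flow):
    # Warp an annotation map by a dense flow field (h, w, 2) of (dy, dx)
    # offsets, as SIFT flow would produce; nearest-neighbor resampling
    # keeps the labels discrete.
    h, w = label_map.shape
    ys, xs = np.mgrid[0:h, 0:w]
    sy = np.clip(ys + flow[..., 0].round().astype(int), 0, h - 1)
    sx = np.clip(xs + flow[..., 1].round().astype(int), 0, w - 1)
    return label_map[sy, sx]

def transfer_labels(warped_label_maps, n_classes):
    # Per-pixel vote over the warped annotations; the paper fuses these
    # cues in an MRF instead, so this unary vote is a simplification.
    h, w = warped_label_maps[0].shape
    votes = np.zeros((n_classes, h, w))
    for lm in warped_label_maps:
        for c in range(n_classes):
            votes[c] += (lm == c)
    return votes.argmax(axis=0)
```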
Tracking of a non-rigid object via patch-based dynamic appearance modeling and adaptive Basin Hopping Monte Carlo sampling
Pub Date: 2009-06-20 | DOI: 10.1109/CVPR.2009.5206502
Junseok Kwon, Kyoung Mu Lee
{"title":"Tracking of a non-rigid object via patch-based dynamic appearance modeling and adaptive Basin Hopping Monte Carlo sampling","authors":"Junseok Kwon, Kyoung Mu Lee","doi":"10.1109/CVPR.2009.5206502","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206502","url":null,"abstract":"We propose a novel tracking algorithm for the target of which geometric appearance changes drastically over time. To track it, we present a local patch-based appearance model and provide an efficient scheme to evolve the topology between local patches by on-line update. In the process of on-line update, the robustness of each patch in the model is estimated by a new method of measurement which analyzes the landscape of local mode of the patch. This patch can be moved, deleted or newly added, which gives more flexibility to the model. Additionally, we introduce the Basin Hopping Monte Carlo (BHMC) sampling method to our tracking problem to reduce the computational complexity and deal with the problem of getting trapped in local minima. The BHMC method makes it possible for our appearance model to consist of enough numbers of patches. Since BHMC uses the same local optimizer that is used in the appearance modeling, it can be efficiently integrated into our tracking framework. Experimental results show that our approach tracks the object whose geometric appearance is drastically changing, accurately and robustly.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121513086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 241
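The core sampling idea can be illustrated with a generic Basin Hopping Monte Carlo loop: perturb the state, descend to a local optimum, then Metropolis-accept the hop. This toy version optimizes a 2D position under an arbitrary appearance score rather than the paper's patch configurations, and the greedy grid search stands in for the authors' local optimizer; everything here is an illustrative assumption, not their implementation.

```python
import numpy as np

def basin_hopping_mc(score, x0, n_iters=100, step=5.0, temp=1.0, rng=None):
    # Maximize `score` (assumed bounded) by hopping between basins.
    rng = rng or np.random.default_rng()

    def local_opt(x):
        # Greedy hill climbing on a unit grid; each move strictly
        # improves the score, so the loop terminates for bounded scores.
        while True:
            moves = [x + d for d in ([1, 0], [-1, 0], [0, 1], [0, -1])]
            best = max(moves, key=score)
            if score(best) <= score(x):
                return x
            x = best

    x = local_opt(np.asarray(x0, float))
    best_x = x
    for _ in range(n_iters):
        # Random perturbation followed by local optimization = one hop.
        cand = local_opt(x + rng.normal(0, step, size=x.shape))
        # Metropolis acceptance: always accept improvements, sometimes
        # accept downhill hops to escape local minima.
        if rng.random() < np.exp(min(0.0, (score(cand) - score(x)) / temp)):
            x = cand
        if score(x) > score(best_x):
            best_x = x
    return best_x
```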
Random walks on graphs to model saliency in images
Pub Date: 2009-06-20 | DOI: 10.1109/CVPR.2009.5206767
Viswanath Gopalakrishnan, Yiqun Hu, D. Rajan
{"title":"Random walks on graphs to model saliency in images","authors":"Viswanath Gopalakrishnan, Yiqun Hu, D. Rajan","doi":"10.1109/CVPR.2009.5206767","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206767","url":null,"abstract":"We formulate the problem of salient region detection in images as Markov random walks performed on images represented as graphs. While the global properties of the image are extracted from the random walk on a complete graph, the local properties are extracted from a k-regular graph. The most salient node is selected as the one which is globally most isolated but falls on a compact object. The equilibrium hitting times of the ergodic Markov chain holds the key for identifying the most salient node. The background nodes which are farthest from the most salient node are also identified based on the hitting times calculated from the random walk. Finally, a seeded salient region identification mechanism is developed to identify the salient parts of the image. The robustness of the proposed algorithm is objectively demonstrated with experiments carried out on a large image database annotated with “ground-truth” salient regions.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127832722","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 100
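Since the method hinges on hitting times of an ergodic chain, a small sketch may help. It computes expected hitting times by solving the standard linear system and scores each node by how hard it is to reach from the rest of the graph; the paper's combination of complete and k-regular graphs and its compactness criterion are omitted here.

```python
import numpy as np

def hitting_times_to(P, j):
    # Expected hitting times h_i of node j for transition matrix P:
    # h_j = 0 and h_i = 1 + sum_k P[i, k] * h_k, i.e. (I - P_sub) h = 1
    # over the nodes other than j.
    n = P.shape[0]
    idx = [i for i in range(n) if i != j]
    A = np.eye(n - 1) - P[np.ix_(idx, idx)]
    h = np.linalg.solve(A, np.ones(n - 1))
    out = np.zeros(n)
    out[idx] = h
    return out

def most_salient_node(W):
    # Row-normalise an affinity matrix into a random walk and score each
    # node by the total time the walk needs to reach it from elsewhere;
    # a globally isolated node accumulates the largest hitting times.
    P = W / W.sum(axis=1, keepdims=True)
    scores = [hitting_times_to(P, j).sum() for j in range(W.shape[0])]
    return int(np.argmax(scores))
```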
Recognising action as clouds of space-time interest points
Pub Date: 2009-06-20 | DOI: 10.1109/CVPR.2009.5206779
Matteo Bregonzio, S. Gong, T. Xiang
{"title":"Recognising action as clouds of space-time interest points","authors":"Matteo Bregonzio, S. Gong, T. Xiang","doi":"10.1109/CVPR.2009.5206779","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206779","url":null,"abstract":"Much of recent action recognition research is based on space-time interest points extracted from video using a Bag of Words (BOW) representation. It mainly relies on the discriminative power of individual local space-time descriptors, whilst ignoring potentially valuable information about the global spatio-temporal distribution of interest points. In this paper, we propose a novel action recognition approach which differs significantly from previous interest points based approaches in that only the global spatiotemporal distribution of the interest points are exploited. This is achieved through extracting holistic features from clouds of interest points accumulated over multiple temporal scales followed by automatic feature selection. Our approach avoids the non-trivial problems of selecting the optimal space-time descriptor, clustering algorithm for constructing a codebook, and selecting codebook size faced by previous interest points based methods. Our model is able to capture smooth motions, robust to view changes and occlusions at a low computation cost. Experiments using the KTH and WEIZMANN datasets demonstrate that our approach outperforms most existing methods.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127597989","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 415
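As a rough illustration of holistic cloud features, the sketch below accumulates interest points over several temporal scales and describes each cloud by simple shape statistics. The specific statistics are stand-ins chosen for clarity, not the feature set used in the paper.

```python
import numpy as np

def cloud_features(points, t_now, scales=(5, 10, 20)):
    # points: (N, 3) array of (t, y, x) space-time interest points.
    # For each temporal scale s, accumulate points in [t_now - s, t_now]
    # into a cloud and describe it holistically, echoing the paper's use
    # of multi-scale clouds in place of local descriptors.
    feats = []
    for s in scales:
        cloud = points[(points[:, 0] >= t_now - s) & (points[:, 0] <= t_now)]
        if len(cloud) == 0:
            feats.extend([0.0] * 5)
            continue
        ys, xs = cloud[:, 1], cloud[:, 2]
        h = ys.max() - ys.min() + 1e-6
        w = xs.max() - xs.min() + 1e-6
        feats.extend([
            len(cloud),           # number of points in the cloud
            h / w,                # aspect ratio of the cloud's extent
            ys.std() / h,         # vertical spread, scale-normalised
            xs.std() / w,         # horizontal spread, scale-normalised
            len(cloud) / (h * w)  # point density over the bounding box
        ])
    return np.asarray(feats)
```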
Boosted multi-task learning for face verification with applications to web image and video search
Pub Date: 2009-06-20 | DOI: 10.1109/CVPR.2009.5206736
Xiaogang Wang, Cha Zhang, Zhengyou Zhang
{"title":"Boosted multi-task learning for face verification with applications to web image and video search","authors":"Xiaogang Wang, Cha Zhang, Zhengyou Zhang","doi":"10.1109/CVPR.2009.5206736","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206736","url":null,"abstract":"Face verification has many potential applications including filtering and ranking image/video search results on celebrities. Since these images/videos are taken under uncontrolled environments, the problem is very challenging due to dramatic lighting and pose variations, low resolutions, compression artifacts, etc. In addition, the available number of training images for each celebrity may be limited, hence learning individual classifiers for each person may cause overfitting. In this paper, we propose two ideas to meet the above challenges. First, we propose to use individual bins, instead of whole histograms, of Local Binary Patterns (LBP) as features for learning, which yields significant performance improvements and computation reduction in our experiments. Second, we present a novel Multi-Task Learning (MTL) framework, called Boosted MTL, for face verification with limited training data. It jointly learns classifiers for multiple people by sharing a few boosting classifiers in order to avoid overfitting. The effectiveness of Boosted MTL and LBP bin features is verified with a large number of celebrity images/videos from the web.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127440756","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 125
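The bin-level feature idea is easy to sketch: compute LBP codes per grid cell and expose every histogram bin as its own scalar feature for a booster to select from. The boosting itself and the Boosted MTL sharing mechanism are omitted; the grid size and neighborhood layout below are assumptions, not the paper's exact configuration.

```python
import numpy as np

def lbp_codes(img):
    # 8-neighbour Local Binary Pattern codes on the interior pixel grid;
    # each pixel gets an 8-bit code comparing it with its neighbours.
    c = img[1:-1, 1:-1]
    shifts = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]
    code = np.zeros_like(c, dtype=np.uint8)
    for bit, (dy, dx) in enumerate(shifts):
        nb = img[1 + dy:img.shape[0] - 1 + dy, 1 + dx:img.shape[1] - 1 + dx]
        code |= (nb >= c).astype(np.uint8) << bit
    return code

def lbp_bin_features(img, grid=4):
    # The paper feeds individual histogram bins (not whole histograms)
    # to the booster; here each cell contributes 256 scalar bin counts.
    h, w = img.shape
    feats = []
    for gy in range(grid):
        for gx in range(grid):
            cell = img[gy * h // grid:(gy + 1) * h // grid,
                       gx * w // grid:(gx + 1) * w // grid]
            feats.append(np.bincount(lbp_codes(cell).ravel(), minlength=256))
    return np.concatenate(feats).astype(float)
```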
Distributed volumetric scene geometry reconstruction with a network of distributed smart cameras
Pub Date: 2009-06-20 | DOI: 10.1109/CVPR.2009.5206589
Shubao Liu, Kongbin Kang, Jean-Philippe Tarel, D. Cooper
{"title":"Distributed volumetric scene geometry reconstruction with a network of distributed smart cameras","authors":"Shubao Liu, Kongbin Kang, Jean-Philippe Tarel, D. Cooper","doi":"10.1109/CVPR.2009.5206589","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206589","url":null,"abstract":"Central to many problems in scene understanding based on using a network of tens, hundreds or even thousands of randomly distributed cameras with on-board processing and wireless communication capability is the “efficient” reconstruction of the 3D geometry structure in the scene. What is meant by “efficient” reconstruction? In this paper we investigate this from different aspects in the context of visual sensor networks and offer a distributed reconstruction algorithm roughly meeting the following goals: 1. Close to achievable 3D reconstruction accuracy and robustness; 2. Minimization of the processing time by adaptive computing-job distribution among all the cameras in the network and asynchronous parallel processing; 3. Communication Optimization and minimization of the (battery-stored) energy, by reducing and localizing the communications between cameras. A volumetric representation of the scene is reconstructed with a shape from apparent contour algorithm, which is suitable for distributed processing because it is essentially a local operation in terms of the involved cameras, and apparent contours are robust to ourdoor illumination conditions. Each camera processes its own image and performs the computation for a small subset of voxels, and updates the voxels through collaborating with its neighbor cameras. By exploring the structure of the reconstruction algorithm, we design the minimum-spanning-tree (MST) message passing protocol in order to minimize the communication. Of interest is that the resulting system is an example of “swarm behavior”. 3D reconstruction is illustrated using two real image sets, running on a single computer. The iterative computations used in the single processor experiment are exactly the same as are those used in the network computations. Distributed concepts and algorithms for network control and communication performance are theoretical designs and estimates.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127455516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
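The communication side of the design can be illustrated with the minimum-spanning-tree construction itself. The sketch below runs Prim's algorithm on a pairwise communication-cost matrix; in the paper the tree is realized as a distributed message-passing protocol among the cameras, whereas this version is centralized for brevity.

```python
import numpy as np

def mst_edges(cost):
    # Prim's algorithm on a symmetric camera-to-camera communication
    # cost matrix (e.g. derived from radio distance). Routing voxel
    # updates along the resulting tree localises and minimises traffic.
    n = cost.shape[0]
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        best = None
        for i in in_tree:
            for j in range(n):
                if j not in in_tree and (
                        best is None or cost[i, j] < cost[best[0], best[1]]):
                    best = (i, j)
        edges.append(best)       # cheapest edge leaving the tree so far
        in_tree.add(best[1])
    return edges
```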
Multiphase geometric couplings for the segmentation of neural processes
Pub Date: 2009-06-20 | DOI: 10.1109/CVPR.2009.5206524
Amelio Vázquez Reina, E. Miller, H. Pfister
{"title":"Multiphase geometric couplings for the segmentation of neural processes","authors":"Amelio Vázquez Reina, E. Miller, H. Pfister","doi":"10.1109/CVPR.2009.5206524","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206524","url":null,"abstract":"The ability to constrain the geometry of deformable models for image segmentation can be useful when information about the expected shape or positioning of the objects in a scene is known a priori. An example of this occurs when segmenting neural cross sections in electron microscopy. Such images often contain multiple nested boundaries separating regions of homogeneous intensities. For these applications, multiphase level sets provide a partitioning framework that allows for the segmentation of multiple deformable objects by combining several level set functions. Although there has been much effort in the study of statistical shape priors that can be used to constrain the geometry of each partition, none of these methods allow for the direct modeling of geometric arrangements of partitions. In this paper, we show how to define elastic couplings between multiple level set functions to model ribbon-like partitions. We build such couplings using dynamic force fields that can depend on the image content and relative location and shape of the level set functions. To the best of our knowledge, this is the first work that shows a direct way of geometrically constraining multiphase level sets for image segmentation. We demonstrate the robustness of our method by comparing it with previous level set segmentation methods.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124424459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 50
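A toy example of an elastic coupling between two level-set functions: penalize deviation of their difference from a target ribbon width, so the two zero-level sets are pulled toward a band of constant thickness. This deliberately omits the image-driven and curvature terms a real segmentation energy would need; it shows only the coupling force, and the quadratic penalty is an assumed form, not the paper's.

```python
import numpy as np

def evolve_coupled(phi1, phi2, width, dt=0.1, k=1.0, steps=100):
    # Gradient descent on 0.5 * k * (phi1 - phi2 - width)^2 per pixel.
    # The fixed point is phi1 - phi2 = width everywhere, i.e. a ribbon
    # of constant thickness between the two zero-level sets.
    # Stable for dt * k < 1.
    for _ in range(steps):
        force = k * (phi1 - phi2 - width)
        phi1, phi2 = phi1 - dt * force, phi2 + dt * force
    return phi1, phi2
```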
Learning to detect unseen object classes by between-class attribute transfer
Pub Date: 2009-06-20 | DOI: 10.1109/CVPR.2009.5206594
Christoph H. Lampert, H. Nickisch, S. Harmeling
{"title":"Learning to detect unseen object classes by between-class attribute transfer","authors":"Christoph H. Lampert, H. Nickisch, S. Harmeling","doi":"10.1109/CVPR.2009.5206594","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206594","url":null,"abstract":"We study the problem of object classification when training and test classes are disjoint, i.e. no training examples of the target classes are available. This setup has hardly been studied in computer vision research, but it is the rule rather than the exception, because the world contains tens of thousands of different object classes and for only a very few of them image, collections have been formed and annotated with suitable class labels. In this paper, we tackle the problem by introducing attribute-based classification. It performs object detection based on a human-specified high-level description of the target objects instead of training images. The description consists of arbitrary semantic attributes, like shape, color or even geographic information. Because such properties transcend the specific learning task at hand, they can be pre-learned, e.g. from image datasets unrelated to the current task. Afterwards, new classes can be detected based on their attribute representation, without the need for a new training phase. In order to evaluate our method and to facilitate research in this area, we have assembled a new large-scale dataset, “Animals with Attributes”, of over 30,000 animal images that match the 50 classes in Osherson's classic table of how strongly humans associate 85 semantic attributes with animal classes. Our experiments show that by using an attribute layer it is indeed possible to build a learning object detection system that does not require any training images of the target classes.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132650627","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2283
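The attribute-transfer step lends itself to a compact sketch in the spirit of the paper's direct attribute prediction: score each unseen class by how well pre-trained attribute classifiers' outputs match that class's attribute signature. Attribute priors are dropped for simplicity, so this is a simplified reading of the method rather than a faithful reimplementation.

```python
import numpy as np

def dap_predict(attr_probs, class_attr, eps=1e-8):
    # attr_probs: (M,) posteriors p(a_m = 1 | image) from pre-trained
    # attribute classifiers. class_attr: (C, M) binary class-attribute
    # matrix (e.g. Osherson's table). Returns the unseen class whose
    # attribute signature best explains the predicted attributes,
    # assuming independent attributes and a uniform class prior.
    p = np.clip(attr_probs, eps, 1 - eps)
    log_lik = class_attr @ np.log(p) + (1 - class_attr) @ np.log(1 - p)
    return int(np.argmax(log_lik))
```

For example, with two hypothetical classes over three attributes, `dap_predict(np.array([0.9, 0.1, 0.8]), np.array([[1, 0, 1], [0, 1, 0]]))` returns 0, the class whose signature matches the high-confidence attributes.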
Blind motion deblurring from a single image using sparse approximation
Pub Date: 2009-06-20 | DOI: 10.1109/CVPR.2009.5206743
Jian-Feng Cai, Hui Ji, Chaoqiang Liu, Zuowei Shen
{"title":"Blind motion deblurring from a single image using sparse approximation","authors":"Jian-Feng Cai, Hui Ji, Chaoqiang Liu, Zuowei Shen","doi":"10.1109/CVPR.2009.5206743","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206743","url":null,"abstract":"Restoring a clear image from a single motion-blurred image due to camera shake has long been a challenging problem in digital imaging. Existing blind deblurring techniques either only remove simple motion blurring, or need user interactions to work on more complex cases. In this paper, we present an approach to remove motion blurring from a single image by formulating the blind blurring as a new joint optimization problem, which simultaneously maximizes the sparsity of the blur kernel and the sparsity of the clear image under certain suitable redundant tight frame systems (curvelet system for kernels and framelet system for images). Without requiring any prior information of the blur kernel as the input, our proposed approach is able to recover high-quality images from given blurred images. Furthermore, the new sparsity constraints under tight frame systems enable the application of a fast algorithm called linearized Bregman iteration to efficiently solve the proposed minimization problem. The experiments on both simulated images and real images showed that our algorithm can effectively removing complex motion blurring from nature images.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132291167","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 317
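The linearized Bregman iteration mentioned in the abstract is simple enough to sketch for the generic problem min ||u||_1 subject to Au = b. The frame transforms (curvelets for kernels, framelets for images) and the joint kernel-image alternation are not shown; A here is just an explicit matrix.

```python
import numpy as np

def linearized_bregman(A, b, mu=10.0, n_iters=500):
    # Linearized Bregman iteration for min ||u||_1 s.t. A u = b, the
    # solver the paper applies to its frame-domain sparsity problem.
    # The step size is set from the spectral norm of A for stability.
    delta = 1.0 / np.linalg.norm(A, 2) ** 2
    u = np.zeros(A.shape[1])
    v = np.zeros(A.shape[1])
    for _ in range(n_iters):
        v += A.T @ (b - A @ u)                 # Bregman (residual) update
        # Soft-thresholding (shrinkage) enforces sparsity of u.
        u = delta * np.sign(v) * np.maximum(np.abs(v) - mu, 0.0)
    return u
```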
The geometry of 2D image signals
Pub Date: 2009-06-20 | DOI: 10.1109/CVPR.2009.5206784
Lennart Wietzke, G. Sommer, O. Fleischmann
{"title":"The geometry of 2D image signals","authors":"Lennart Wietzke, G. Sommer, O. Fleischmann","doi":"10.1109/CVPR.2009.5206784","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206784","url":null,"abstract":"This paper covers a fundamental problem of local phase based signal processing: the isotropic generalization of the classical 1D analytic signal to two dimensions. The well known analytic signal enables the analysis of local phase and amplitude information of 1D signals. Local phase, amplitude and additional orientation information can be extracted by the 2D monogenic signal with the restriction to the subclass of intrinsically one dimensional signals. In case of 2D image signals the monogenic signal enables the rotationally invariant analysis of lines and edges. In this work we present the 2D analytic signal as a novel generalization of both the analytic signal and the 2D monogenic signal. In case of 2D image signals the 2D analytic signal enables the isotropic analysis of lines, edges, corners and junctions in one unified framework. Furthermore, we show that 2D signals exist per se in a 3D projective subspace of the homogeneous conformal space which delivers a descriptive geometric interpretation of signals providing new insights on the relation of geometry and 2D signals.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122197540","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 28
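For context, the 2D monogenic signal that this paper generalizes can be computed with a Riesz transform in the Fourier domain, yielding local amplitude, phase and orientation. The sketch below implements that standard prior construction, not the paper's new 2D analytic signal.

```python
import numpy as np

def monogenic_signal(img):
    # Riesz transform via the frequency-domain kernels -i*wx/|w| and
    # -i*wy/|w| gives the monogenic triple (f, r1, r2), from which local
    # amplitude, phase and orientation follow.
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    r = np.sqrt(fx ** 2 + fy ** 2)
    r[0, 0] = 1.0                    # avoid division by zero at DC
    F = np.fft.fft2(img)
    r1 = np.real(np.fft.ifft2(-1j * fx / r * F))
    r2 = np.real(np.fft.ifft2(-1j * fy / r * F))
    amplitude = np.sqrt(img ** 2 + r1 ** 2 + r2 ** 2)
    phase = np.arctan2(np.sqrt(r1 ** 2 + r2 ** 2), img)
    orientation = np.arctan2(r2, r1)
    return amplitude, phase, orientation
```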