{"title":"Vocabulary hierarchy optimization for effective and transferable retrieval","authors":"R. Ji, Xing Xie, H. Yao, Wei-Ying Ma","doi":"10.1109/CVPR.2009.5206680","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206680","url":null,"abstract":"Scalable image retrieval systems usually involve hierarchical quantization of local image descriptors, which produces a visual vocabulary for inverted indexing of images. Although hierarchical quantization has the merit of retrieval efficiency, the resulting visual vocabulary representation usually faces two crucial problems: (1) hierarchical quantization errors and biases in the generation of “visual words”; (2) the model cannot adapt to database variance. In this paper, we describe an unsupervised optimization strategy in generating the hierarchy structure of visual vocabulary, which produces a more effective and adaptive retrieval model for large-scale search. We adopt a novel density-based metric learning (DML) algorithm, which corrects word quantization bias without supervision in hierarchy optimization, based on which we present a hierarchical rejection chain for efficient online search based on the vocabulary hierarchy. We also discovered that by hierarchy optimization, efficient and effective transfer of a retrieval model across different databases is feasible. We deployed a large-scale image retrieval system using a vocabulary tree model to validate our advances. Experiments on UKBench and street-side urban scene databases demonstrated the effectiveness of our hierarchy optimization approach in comparison with state-of-the-art methods.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121663228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Reducing JointBoost-based multiclass classification to proximity search","authors":"Alexandra Stefan, V. Athitsos, Quan Yuan, S. Sclaroff","doi":"10.1109/CVPR.2009.5206687","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206687","url":null,"abstract":"Boosted one-versus-all (OVA) classifiers are commonly used in multiclass problems, such as generic object recognition, biometrics-based identification, or gesture recognition. JointBoost is a recently proposed method where OVA classifiers are trained jointly and are forced to share features. JointBoost has been demonstrated to lead both to higher accuracy and smaller classification time, compared to using OVA classifiers that were trained independently and without sharing features. However, even with the improved efficiency of JointBoost, the time complexity of OVA-based multiclass recognition is still linear to the number of classes, and can lead to prohibitively large running times in domains with a very large number of classes. In this paper, it is shown that JointBoost-based recognition can be reduced, at classification time, to nearest neighbor search in a vector space. Using this reduction, we propose a simple and easy-to-implement vector indexing scheme based on principal component analysis (PCA). In our experiments, the proposed method achieves a speedup of two orders of magnitude over standard JointBoost classification, in a hand pose recognition system where the number of classes is close to 50,000, with negligible loss in classification accuracy. Our method also yields promising results in experiments on the widely used FRGC-2 face recognition dataset, where the number of classes is 535.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121473635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Optimal scanning for faster object detection","authors":"N. Butko, J. Movellan","doi":"10.1109/CVPR.2009.5206540","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206540","url":null,"abstract":"Recent years have seen the development of fast and accurate algorithms for detecting objects in images. However, as the size of the scene grows, so do the running-times of these algorithms. If a 128×102 pixel image requires 20 ms to process, searching for objects in a 1280×1024 image will take 2 s. This is unsuitable under real-time operating constraints: by the time a frame has been processed, the object may have moved. An analogous problem occurs when controlling robot camera that need to scan scenes in search of target objects. In this paper, we consider a method for improving the run-time of general-purpose object-detection algorithms. Our method is based on a model of visual search in humans, which schedules eye fixations to maximize the long-term information accrued about the location of the target of interest. The approach can be used to drive robot cameras that physically scan scenes or to improve the scanning speed for very large high resolution images. We consider the latter application in this work by simulating a “digital fovea” and sequentially placing it in various regions of an image in a way that maximizes the expected information gain. We evaluate the approach using the OpenCV version of the Viola-Jones face detector. After accounting for all computational overhead introduced by the fixation controller, the approach doubles the speed of the standard Viola-Jones detector at little cost in accuracy.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122784994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robustifying eye center localization by head pose cues","authors":"R. Valenti, Zeynep Yücel, T. Gevers","doi":"10.1109/CVPR.2009.5206640","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206640","url":null,"abstract":"Head pose and eye location estimation are two closely related issues which refer to similar application areas. In recent years, these problems have been studied individually in numerous works in the literature. Previous research shows that cylindrical head models and isophote based schemes provide satisfactory precision in head pose and eye location estimation, respectively. However, the eye locator is not adequate to accurately locate eye in the presence of extreme head poses. Therefore, head pose cues may be suited to enhance the accuracy of eye localization in the presence of severe head poses. In this paper, a hybrid scheme is proposed in which the transformation matrix obtained from the head pose is used to normalize the eye regions and, in turn the transformation matrix generated by the found eye location is used to correct the pose estimation procedure. The scheme is designed to (1) enhance the accuracy of eye location estimations in low resolution videos, (2) to extend the operating range of the eye locator and (3) to improve the accuracy and re-initialization capabilities of the pose tracker. From the experimental results it can be derived that the proposed unified scheme improves the accuracy of eye estimations by 16% to 23%. Further, it considerably extends its operating range by more than 15°, by overcoming the problems introduced by extreme head poses. Finally, the accuracy of the head pose tracker is improved by 12% to 24%.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132528248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Geometric reasoning for single image structure recovery","authors":"David C. Lee, M. Hebert, T. Kanade","doi":"10.1109/CVPR.2009.5206872","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206872","url":null,"abstract":"We study the problem of generating plausible interpretations of a scene from a collection of line segments automatically extracted from a single indoor image. We show that we can recognize the three dimensional structure of the interior of a building, even in the presence of occluding objects. Several physically valid structure hypotheses are proposed by geometric reasoning and verified to find the best fitting model to line segments, which is then converted to a full 3D model. Our experiments demonstrate that our structure recovery from line segments is comparable with methods using full image appearance. Our approach shows how a set of rules describing geometric constraints between groups of segments can be used to prune scene interpretation hypotheses and to generate the most plausible interpretation.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132581951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An empirical Bayes approach to contextual region classification","authors":"S. Lazebnik, M. Raginsky","doi":"10.1109/CVPR.2009.5206690","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206690","url":null,"abstract":"This paper presents a nonparametric approach to labeling of local image regions that is inspired by recent developments in information-theoretic denoising. The chief novelty of this approach rests in its ability to derive an unsupervised contextual prior over image classes from unlabeled test data. Labeled training data is needed only to learn a local appearance model for image patches (although additional supervisory information can optionally be incorporated when it is available). Instead of assuming a parametric prior such as a Markov random field for the class labels, the proposed approach uses the empirical Bayes technique of statistical inversion to recover a contextual model directly from the test data, either as a spatially varying or as a globally constant prior distribution over the classes in the image. Results on two challenging datasets convincingly demonstrate that useful contextual information can indeed be learned from unlabeled data.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134234300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust unsupervised segmentation of degraded document images with topic models","authors":"Timothy J. Burns, Jason J. Corso","doi":"10.1109/CVPR.2009.5206606","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206606","url":null,"abstract":"Segmentation of document images remains a challenging vision problem. Although document images have a structured layout, capturing enough of it for segmentation can be difficult. Most current methods combine text extraction and heuristics for segmentation, but text extraction is prone to failure and measuring accuracy remains a difficult challenge. Furthermore, when presented with significant degradation many common heuristic methods fall apart. In this paper, we propose a Bayesian generative model for document images which seeks to overcome some of these drawbacks. Our model automatically discovers different regions present in a document image in a completely unsupervised fashion. We attempt no text extraction, but rather use discrete patch-based codebook learning to make our probabilistic representation feasible. Each latent region topic is a distribution over these patch indices. We capture rough document layout with an MRF Potts model. We take an analysis by synthesis approach to examine the model, and provide quantitative segmentation results on a manually labeled document image data set. We illustrate our model's robustness by providing results on a highly degraded version of our test set.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133820220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"LidarBoost: Depth superresolution for ToF 3D shape scanning","authors":"Sebastian Schuon, C. Theobalt, James Davis, S. Thrun","doi":"10.1109/CVPR.2009.5206804","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206804","url":null,"abstract":"Depth maps captured with time-of-flight cameras have very low data quality: the image resolution is rather limited and the level of random noise contained in the depth maps is very high. Therefore, such flash lidars cannot be used out of the box for high-quality 3D object scanning. To solve this problem, we present LidarBoost, a 3D depth superresolution method that combines several low resolution noisy depth images of a static scene from slightly displaced viewpoints, and merges them into a high-resolution depth image. We have developed an optimization framework that uses a data fidelity term and a geometry prior term that is tailored to the specific characteristics of flash lidars. We demonstrate both visually and quantitatively that LidarBoost produces better results than previous methods from the literature.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115431083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Illumination and spatially varying specular reflectance from a single view","authors":"K. Hara, K. Nishino","doi":"10.1109/CVPR.2009.5206764","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206764","url":null,"abstract":"Estimating the illumination and the reflectance properties of an object surface from a sparse set of images is an important but inherently ill-posed problem. The problem becomes even harder if we wish to account for the spatial variation of material properties on the surface. In this paper, we derive a novel method for estimating the spatially varying specular reflectance properties, of a surface of known geometry, as well as the illumination distribution from a specular-only image, for instance, captured using polarization to separate reflection components. Unlike previous work, we do not assume the illumination to be a single point light source. We model specular reflection with a spherical statistical distribution and encode the spatial variation with radial basis functions of its parameters. This allows us to formulate the simultaneous estimation of spatially varying specular reflectance and illumination as a sound probabilistic inference problem, in particular, using Csiszar's I-divergence measure. To solve it, we derive an iterative algorithm similar to expectation maximization. We demonstrate the effectiveness of the method on synthetic and real-world scenes.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115439556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Simultaneous image classification and annotation","authors":"Chong Wang, D. Blei, Li Fei-Fei","doi":"10.1109/CVPR.2009.5206800","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206800","url":null,"abstract":"Image classification and annotation are important problems in computer vision, but rarely considered together. Intuitively, annotations provide evidence for the class label, and the class label provides evidence for annotations. For example, an image of class highway is more likely annotated with words “road,” “car,” and “traffic” than words “fish,” “boat,” and “scuba.” In this paper, we develop a new probabilistic model for jointly modeling the image, its class label, and its annotations. Our model treats the class label as a global description of the image, and treats annotation terms as local descriptions of parts of the image. Its underlying probabilistic assumptions naturally integrate these two sources of information. We derive an approximate inference and estimation algorithms based on variational methods, as well as efficient approximations for classifying and annotating new images. We examine the performance of our model on two real-world image data sets, illustrating that a single model provides competitive annotation performance, and superior classification performance.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115540802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}