2009 IEEE Conference on Computer Vision and Pattern Recognition: Latest Publications

Vocabulary hierarchy optimization for effective and transferable retrieval
2009 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2009-06-20 DOI: 10.1109/CVPR.2009.5206680
R. Ji, Xing Xie, H. Yao, Wei-Ying Ma
{"title":"Vocabulary hierarchy optimization for effective and transferable retrieval","authors":"R. Ji, Xing Xie, H. Yao, Wei-Ying Ma","doi":"10.1109/CVPR.2009.5206680","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206680","url":null,"abstract":"Scalable image retrieval systems usually involve hierarchical quantization of local image descriptors, which produces a visual vocabulary for inverted indexing of images. Although hierarchical quantization has the merit of retrieval efficiency, the resulting visual vocabulary representation usually faces two crucial problems: (1) hierarchical quantization errors and biases in the generation of “visual words”; (2) the model cannot adapt to database variance. In this paper, we describe an unsupervised optimization strategy in generating the hierarchy structure of visual vocabulary, which produces a more effective and adaptive retrieval model for large-scale search. We adopt a novel density-based metric learning (DML) algorithm, which corrects word quantization bias without supervision in hierarchy optimization, based on which we present a hierarchical rejection chain for efficient online search based on the vocabulary hierarchy. We also discovered that by hierarchy optimization, efficient and effective transfer of a retrieval model across different databases is feasible. We deployed a large-scale image retrieval system using a vocabulary tree model to validate our advances. Experiments on UKBench and street-side urban scene databases demonstrated the effectiveness of our hierarchy optimization approach in comparison with state-of-the-art methods.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121663228","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 28
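A minimal sketch of the hierarchical quantization this abstract builds on: descriptors are clustered recursively into a vocabulary tree, and each descriptor is assigned a visual word by greedy root-to-leaf descent. Only the baseline tree is shown here (the paper's density-based metric learning and hierarchy optimization are not reproduced); the branch factor, depth, and the random stand-in descriptors are illustrative assumptions.

```python
import numpy as np

def kmeans(points, k, iters=10, rng=None):
    """Plain Lloyd's k-means; returns (centers, hard assignments)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    centers = points[rng.choice(len(points), size=k, replace=False)].copy()
    for _ in range(iters):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            if (assign == j).any():
                centers[j] = points[assign == j].mean(axis=0)
    return centers, assign

def build_tree(descriptors, branch=4, depth=2, rng=None):
    """Recursively cluster descriptors into a vocabulary tree."""
    if depth == 0 or len(descriptors) < branch:
        return None
    centers, assign = kmeans(descriptors, branch, rng=rng)
    children = [build_tree(descriptors[assign == j], branch, depth - 1, rng)
                for j in range(branch)]
    return {"centers": centers, "children": children}

def quantize(node, desc, path=()):
    """Greedy root-to-leaf descent; the leaf path is the visual word id."""
    if node is None:
        return path
    j = int(np.linalg.norm(node["centers"] - desc, axis=1).argmin())
    return quantize(node["children"][j], desc, path + (j,))

rng = np.random.default_rng(0)
descs = rng.normal(size=(2000, 128)).astype(np.float32)  # stand-in local descriptors
tree = build_tree(descs, branch=4, depth=2, rng=rng)
print("visual word:", quantize(tree, descs[0]))
```

The greedy descent is what makes retrieval fast, and also what introduces the quantization bias the paper's DML step is designed to correct.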
Reducing JointBoost-based multiclass classification to proximity search
2009 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2009-06-20 DOI: 10.1109/CVPR.2009.5206687
Alexandra Stefan, V. Athitsos, Quan Yuan, S. Sclaroff
{"title":"Reducing JointBoost-based multiclass classification to proximity search","authors":"Alexandra Stefan, V. Athitsos, Quan Yuan, S. Sclaroff","doi":"10.1109/CVPR.2009.5206687","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206687","url":null,"abstract":"Boosted one-versus-all (OVA) classifiers are commonly used in multiclass problems, such as generic object recognition, biometrics-based identification, or gesture recognition. JointBoost is a recently proposed method where OVA classifiers are trained jointly and are forced to share features. JointBoost has been demonstrated to lead both to higher accuracy and smaller classification time, compared to using OVA classifiers that were trained independently and without sharing features. However, even with the improved efficiency of JointBoost, the time complexity of OVA-based multiclass recognition is still linear to the number of classes, and can lead to prohibitively large running times in domains with a very large number of classes. In this paper, it is shown that JointBoost-based recognition can be reduced, at classification time, to nearest neighbor search in a vector space. Using this reduction, we propose a simple and easy-to-implement vector indexing scheme based on principal component analysis (PCA). In our experiments, the proposed method achieves a speedup of two orders of magnitude over standard JointBoost classification, in a hand pose recognition system where the number of classes is close to 50,000, with negligible loss in classification accuracy. Our method also yields promising results in experiments on the widely used FRGC-2 face recognition dataset, where the number of classes is 535.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121473635","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 11
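The core of the reduction is that, at test time, each class score becomes an inner product between a query embedding and one stored vector per class, so picking the best class is a maximum-inner-product search that PCA indexing can accelerate. A toy sketch under stated assumptions: the class vectors are synthetic and lie near a low-dimensional subspace, and the exact embedding the paper derives from the shared boosting stumps is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(1)
C, D, d = 5000, 256, 32   # number of classes, score-space dim, PCA dim

# Assumed setup: classifying x reduces to argmax_c of V[c] . u over stored
# per-class vectors V[c]; here V is synthetic, near a d-dimensional subspace.
V = rng.normal(size=(C, d)) @ rng.normal(size=(d, D)) \
    + 0.01 * rng.normal(size=(C, D))
u = rng.normal(size=D)    # query embedding u(x)

# PCA index: top-d principal directions of the centered class vectors.
mu = V.mean(axis=0)
_, _, Vt = np.linalg.svd(V - mu, full_matrices=False)
P = Vt[:d]

exact = int((V @ u).argmax())
# Project both sides; the dropped term mu . u shifts every class score
# equally, so the argmax survives whenever PCA captures V well.
approx = int((((V - mu) @ P.T) @ (P @ u)).argmax())
print(exact == approx)
```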
Optimal scanning for faster object detection
2009 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2009-06-20 DOI: 10.1109/CVPR.2009.5206540
N. Butko, J. Movellan
{"title":"Optimal scanning for faster object detection","authors":"N. Butko, J. Movellan","doi":"10.1109/CVPR.2009.5206540","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206540","url":null,"abstract":"Recent years have seen the development of fast and accurate algorithms for detecting objects in images. However, as the size of the scene grows, so do the running-times of these algorithms. If a 128×102 pixel image requires 20 ms to process, searching for objects in a 1280×1024 image will take 2 s. This is unsuitable under real-time operating constraints: by the time a frame has been processed, the object may have moved. An analogous problem occurs when controlling robot camera that need to scan scenes in search of target objects. In this paper, we consider a method for improving the run-time of general-purpose object-detection algorithms. Our method is based on a model of visual search in humans, which schedules eye fixations to maximize the long-term information accrued about the location of the target of interest. The approach can be used to drive robot cameras that physically scan scenes or to improve the scanning speed for very large high resolution images. We consider the latter application in this work by simulating a “digital fovea” and sequentially placing it in various regions of an image in a way that maximizes the expected information gain. We evaluate the approach using the OpenCV version of the Viola-Jones face detector. After accounting for all computational overhead introduced by the fixation controller, the approach doubles the speed of the standard Viola-Jones detector at little cost in accuracy.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122784994","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 110
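The fixation controller's core loop is: maintain a posterior over the target's location, and greedily choose the fixation that maximizes expected information gain (equivalently, minimizes expected posterior entropy). A one-dimensional toy sketch with an assumed binary "target inside the fovea?" detector; in the paper the observation model wraps a real face detector rather than this simplified reading.

```python
import numpy as np

rng = np.random.default_rng(2)
N, r, a = 50, 3, 0.9          # grid cells, fovea radius, in-fovea accuracy
target = int(rng.integers(N))
p = np.full(N, 1.0 / N)       # belief over the target's location

def window(f):
    m = np.zeros(N, dtype=bool)
    m[max(0, f - r):f + r + 1] = True
    return m

def entropy(q):
    q = q[q > 1e-12]
    return float(-(q * np.log(q)).sum())

def update(p, f, z):
    """Bayes update for a binary reading z = 'target inside fovea at f'."""
    like = np.where(window(f), a if z else 1 - a, 1 - a if z else a)
    q = like * p
    return q / q.sum()

def expected_entropy(p, f):
    """Average posterior entropy over the two possible readings at f."""
    ee = 0.0
    for z in (0, 1):
        like = np.where(window(f), a if z else 1 - a, 1 - a if z else a)
        pz = float((like * p).sum())
        if pz > 0:
            ee += pz * entropy(like * p / pz)
    return ee

for step in range(8):         # greedy infomax fixation policy
    f = min(range(N), key=lambda g: expected_entropy(p, g))
    z = int(rng.random() < (a if window(f)[target] else 1 - a))
    p = update(p, f, z)

print("MAP location:", int(p.argmax()), "true:", target)
```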
Robustifying eye center localization by head pose cues
2009 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2009-06-20 DOI: 10.1109/CVPR.2009.5206640
R. Valenti, Zeynep Yücel, T. Gevers
{"title":"Robustifying eye center localization by head pose cues","authors":"R. Valenti, Zeynep Yücel, T. Gevers","doi":"10.1109/CVPR.2009.5206640","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206640","url":null,"abstract":"Head pose and eye location estimation are two closely related issues which refer to similar application areas. In recent years, these problems have been studied individually in numerous works in the literature. Previous research shows that cylindrical head models and isophote based schemes provide satisfactory precision in head pose and eye location estimation, respectively. However, the eye locator is not adequate to accurately locate eye in the presence of extreme head poses. Therefore, head pose cues may be suited to enhance the accuracy of eye localization in the presence of severe head poses. In this paper, a hybrid scheme is proposed in which the transformation matrix obtained from the head pose is used to normalize the eye regions and, in turn the transformation matrix generated by the found eye location is used to correct the pose estimation procedure. The scheme is designed to (1) enhance the accuracy of eye location estimations in low resolution videos, (2) to extend the operating range of the eye locator and (3) to improve the accuracy and re-initialization capabilities of the pose tracker. From the experimental results it can be derived that the proposed unified scheme improves the accuracy of eye estimations by 16% to 23%. Further, it considerably extends its operating range by more than 15°, by overcoming the problems introduced by extreme head poses. Finally, the accuracy of the head pose tracker is improved by 12% to 24%.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132528248","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 59
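The hybrid loop amounts to: warp the eye region to a canonical view using the pose estimate, locate the eye there, and feed the result back to the pose tracker. A toy sketch of the normalize-then-localize half only, assuming a pure in-plane roll, a synthetic dark-pupil patch, and a darkest-point locator; the paper's cylindrical head model and isophote-based locator are not reproduced.

```python
import numpy as np

def rotate_patch(patch, theta):
    """Rotate about the patch center by theta (radians), via inverse mapping
    with nearest-neighbor sampling; stands in for the pose-driven warp."""
    h, w = patch.shape
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.mgrid[0:h, 0:w]
    c, s = np.cos(theta), np.sin(theta)
    xr = c * (xs - cx) - s * (ys - cy) + cx
    yr = s * (xs - cx) + c * (ys - cy) + cy
    xi = np.clip(np.round(xr).astype(int), 0, w - 1)
    yi = np.clip(np.round(yr).astype(int), 0, h - 1)
    return patch[yi, xi]

# Synthetic eye patch: bright "skin" with a dark pupil at (row 20, col 27).
ys, xs = np.mgrid[0:41, 0:41]
eye = 1.0 - np.exp(-((xs - 27.0) ** 2 + (ys - 20.0) ** 2) / 18.0)

rolled = rotate_patch(eye, np.deg2rad(20))           # simulated head roll
normalized = rotate_patch(rolled, np.deg2rad(-20))   # undo roll via pose cue
iy, ix = np.unravel_index(normalized.argmin(), normalized.shape)
print("eye center in normalized patch:", (iy, ix))   # close to (20, 27)
```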
Geometric reasoning for single image structure recovery
2009 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2009-06-20 DOI: 10.1109/CVPR.2009.5206872
David C. Lee, M. Hebert, T. Kanade
{"title":"Geometric reasoning for single image structure recovery","authors":"David C. Lee, M. Hebert, T. Kanade","doi":"10.1109/CVPR.2009.5206872","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206872","url":null,"abstract":"We study the problem of generating plausible interpretations of a scene from a collection of line segments automatically extracted from a single indoor image. We show that we can recognize the three dimensional structure of the interior of a building, even in the presence of occluding objects. Several physically valid structure hypotheses are proposed by geometric reasoning and verified to find the best fitting model to line segments, which is then converted to a full 3D model. Our experiments demonstrate that our structure recovery from line segments is comparable with methods using full image appearance. Our approach shows how a set of rules describing geometric constraints between groups of segments can be used to prune scene interpretation hypotheses and to generate the most plausible interpretation.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132581951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 467
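Geometric reasoning of this kind typically starts from vanishing points estimated from the extracted segments. A minimal sketch of that first step: each segment defines a homogeneous line l = p1 × p2, and the least-squares vanishing point is the smallest singular vector of the stacked line equations. The synthetic segments converging on (100, 50) are illustrative; hypothesis generation and verification are not shown.

```python
import numpy as np

def vanishing_point(segments):
    """Least-squares intersection of the lines through each segment.
    A segment's line is l = p1 x p2 in homogeneous coordinates; we seek
    v minimizing sum_i (l_i . v)^2, i.e. the smallest singular vector."""
    lines = []
    for (p1, p2) in segments:
        l = np.cross([p1[0], p1[1], 1.0], [p2[0], p2[1], 1.0])
        lines.append(l / np.linalg.norm(l[:2]))  # comparable residuals
    _, _, Vt = np.linalg.svd(np.array(lines))
    v = Vt[-1]
    return v[:2] / v[2]

# Synthetic segments lying on lines through (100, 50), mildly perturbed.
rng = np.random.default_rng(3)
vp = np.array([100.0, 50.0])
segments = []
for _ in range(8):
    d = rng.normal(size=2)
    d /= np.linalg.norm(d)
    t1, t2 = sorted(rng.uniform(20, 120, size=2))
    segments.append((vp + t1 * d + rng.normal(scale=0.3, size=2),
                     vp + t2 * d + rng.normal(scale=0.3, size=2)))
print(vanishing_point(segments))   # close to [100.  50.]
```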
An empirical Bayes approach to contextual region classification
2009 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2009-06-20 DOI: 10.1109/CVPR.2009.5206690
S. Lazebnik, M. Raginsky
{"title":"An empirical Bayes approach to contextual region classification","authors":"S. Lazebnik, M. Raginsky","doi":"10.1109/CVPR.2009.5206690","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206690","url":null,"abstract":"This paper presents a nonparametric approach to labeling of local image regions that is inspired by recent developments in information-theoretic denoising. The chief novelty of this approach rests in its ability to derive an unsupervised contextual prior over image classes from unlabeled test data. Labeled training data is needed only to learn a local appearance model for image patches (although additional supervisory information can optionally be incorporated when it is available). Instead of assuming a parametric prior such as a Markov random field for the class labels, the proposed approach uses the empirical Bayes technique of statistical inversion to recover a contextual model directly from the test data, either as a spatially varying or as a globally constant prior distribution over the classes in the image. Results on two challenging datasets convincingly demonstrate that useful contextual information can indeed be learned from unlabeled data.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134234300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 27
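The central move, recovering a class prior from unlabeled test data, can be caricatured in one dimension: hold the appearance model p(x|c) fixed and fit only the class proportions to the test set's marginal by EM. All numbers below (three classes, a unit-variance Gaussian appearance model) are illustrative, and only the globally constant prior is shown; the paper's statistical inversion also supports spatially varying priors.

```python
import numpy as np

rng = np.random.default_rng(4)
C, N = 3, 2000
true_prior = np.array([0.7, 0.2, 0.1])     # context of this particular test set
means = np.array([-2.0, 0.0, 2.0])         # fixed appearance model p(x | c)

labels = rng.choice(C, size=N, p=true_prior)
x = rng.normal(means[labels], 1.0)         # unlabeled test observations

def likelihood(x):
    """N x C matrix of p(x_i | c) under the fixed Gaussian appearance model."""
    return np.exp(-0.5 * (x[:, None] - means[None, :]) ** 2) / np.sqrt(2 * np.pi)

pi = np.full(C, 1.0 / C)                   # start from a flat prior
for _ in range(100):
    post = likelihood(x) * pi              # E-step: per-point responsibilities
    post /= post.sum(axis=1, keepdims=True)
    pi = post.mean(axis=0)                 # M-step: re-estimate the global prior
print(np.round(pi, 3))                     # approaches [0.7, 0.2, 0.1]
```

Classifying each point with the learned pi instead of a flat prior is exactly how the recovered context sharpens the per-region posteriors.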
Robust unsupervised segmentation of degraded document images with topic models
2009 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2009-06-20 DOI: 10.1109/CVPR.2009.5206606
Timothy J. Burns, Jason J. Corso
{"title":"Robust unsupervised segmentation of degraded document images with topic models","authors":"Timothy J. Burns, Jason J. Corso","doi":"10.1109/CVPR.2009.5206606","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206606","url":null,"abstract":"Segmentation of document images remains a challenging vision problem. Although document images have a structured layout, capturing enough of it for segmentation can be difficult. Most current methods combine text extraction and heuristics for segmentation, but text extraction is prone to failure and measuring accuracy remains a difficult challenge. Furthermore, when presented with significant degradation many common heuristic methods fall apart. In this paper, we propose a Bayesian generative model for document images which seeks to overcome some of these drawbacks. Our model automatically discovers different regions present in a document image in a completely unsupervised fashion. We attempt no text extraction, but rather use discrete patch-based codebook learning to make our probabilistic representation feasible. Each latent region topic is a distribution over these patch indices. We capture rough document layout with an MRF Potts model. We take an analysis by synthesis approach to examine the model, and provide quantitative segmentation results on a manually labeled document image data set. We illustrate our model's robustness by providing results on a highly degraded version of our test set.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133820220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 15
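The MRF Potts layer over region labels can be sketched with iterated conditional modes: each patch label is revisited greedily under its topic log-likelihood plus a reward for agreeing with its four neighbors. The per-patch log-likelihoods below are synthetic, with a planted three-band layout; the paper's codebook learning and generative topic model are not reproduced.

```python
import numpy as np

rng = np.random.default_rng(5)
H, W, K = 20, 30, 3
# Synthetic per-patch log-likelihoods under K latent region topics,
# with a planted three-band layout plus noise.
ll = rng.normal(size=(H, W, K))
ll[:, :10, 0] += 2.0
ll[:, 10:20, 1] += 2.0
ll[:, 20:, 2] += 2.0

beta = 1.5                         # Potts smoothing strength
labels = ll.argmax(axis=2)         # init: per-patch maximum likelihood

def neighbors(i, j):
    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        ni, nj = i + di, j + dj
        if 0 <= ni < H and 0 <= nj < W:
            yield ni, nj

for _ in range(5):                 # ICM sweeps: greedy MAP updates
    for i in range(H):
        for j in range(W):
            agree = np.array([sum(labels[n] == k for n in neighbors(i, j))
                              for k in range(K)])
            labels[i, j] = int((ll[i, j] + beta * agree).argmax())

print(labels[0])                   # settles into three clean bands of 0s, 1s, 2s
```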
LidarBoost: Depth superresolution for ToF 3D shape scanning
2009 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2009-06-20 DOI: 10.1109/CVPR.2009.5206804
Sebastian Schuon, C. Theobalt, James Davis, S. Thrun
{"title":"LidarBoost: Depth superresolution for ToF 3D shape scanning","authors":"Sebastian Schuon, C. Theobalt, James Davis, S. Thrun","doi":"10.1109/CVPR.2009.5206804","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206804","url":null,"abstract":"Depth maps captured with time-of-flight cameras have very low data quality: the image resolution is rather limited and the level of random noise contained in the depth maps is very high. Therefore, such flash lidars cannot be used out of the box for high-quality 3D object scanning. To solve this problem, we present LidarBoost, a 3D depth superresolution method that combines several low resolution noisy depth images of a static scene from slightly displaced viewpoints, and merges them into a high-resolution depth image. We have developed an optimization framework that uses a data fidelity term and a geometry prior term that is tailored to the specific characteristics of flash lidars. We demonstrate both visually and quantitatively that LidarBoost produces better results than previous methods from the literature.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115431083","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 230
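The geometry of the data-fidelity term can be seen in the naive baseline it generalizes: scatter each aligned low-resolution depth sample into its high-resolution cell and average. The synthetic planar depth map, exact integer subpixel shifts, and noise level below are assumptions; LidarBoost itself solves an energy with a lidar-specific regularizer rather than this plain shift-and-add.

```python
import numpy as np

rng = np.random.default_rng(6)
s = 4                                    # superresolution factor
# Synthetic ground-truth depth: a smooth slanted plane, 64 x 64.
hi = np.add.outer(np.linspace(0.0, 1.0, 64), np.linspace(0.0, 2.0, 64))

# 16 low-res, noisy depth maps covering all integer subpixel displacements.
shifts = [(i % s, i // s) for i in range(s * s)]
obs = [hi[dy::s, dx::s] + rng.normal(scale=0.02, size=(16, 16))
       for dx, dy in shifts]

# Shift-and-add: scatter every sample to its high-res cell and average.
acc = np.zeros_like(hi)
cnt = np.zeros_like(hi)
for (dx, dy), o in zip(shifts, obs):
    acc[dy::s, dx::s] += o
    cnt[dy::s, dx::s] += 1.0
recon = acc / np.maximum(cnt, 1.0)

print("RMSE:", float(np.sqrt(((recon - hi) ** 2).mean())))  # near the noise level
```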
Illumination and spatially varying specular reflectance from a single view
2009 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2009-06-20 DOI: 10.1109/CVPR.2009.5206764
K. Hara, K. Nishino
{"title":"Illumination and spatially varying specular reflectance from a single view","authors":"K. Hara, K. Nishino","doi":"10.1109/CVPR.2009.5206764","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206764","url":null,"abstract":"Estimating the illumination and the reflectance properties of an object surface from a sparse set of images is an important but inherently ill-posed problem. The problem becomes even harder if we wish to account for the spatial variation of material properties on the surface. In this paper, we derive a novel method for estimating the spatially varying specular reflectance properties, of a surface of known geometry, as well as the illumination distribution from a specular-only image, for instance, captured using polarization to separate reflection components. Unlike previous work, we do not assume the illumination to be a single point light source. We model specular reflection with a spherical statistical distribution and encode the spatial variation with radial basis functions of its parameters. This allows us to formulate the simultaneous estimation of spatially varying specular reflectance and illumination as a sound probabilistic inference problem, in particular, using Csiszar's I-divergence measure. To solve it, we derive an iterative algorithm similar to expectation maximization. We demonstrate the effectiveness of the method on synthetic and real-world scenes.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115439556","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
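The physical constraint that makes specular-only images informative: at a specular highlight, the surface normal bisects the view and light directions, so reflecting the view about that normal points at the light. A toy sketch for a single dominant light on a sphere of known geometry; the paper's actual method estimates a full illumination distribution and spatially varying lobe parameters via I-divergence minimization, which is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(7)
# Orthographic view of a sphere: per-pixel surface normals.
ys, xs = np.mgrid[-1:1:128j, -1:1:128j]
mask = xs ** 2 + ys ** 2 < 1.0
nz = np.sqrt(np.clip(1.0 - xs ** 2 - ys ** 2, 0.0, 1.0))
normals = np.dstack([xs, ys, nz])

view = np.array([0.0, 0.0, 1.0])
L_true = np.array([0.3, 0.4, 0.866])
L_true /= np.linalg.norm(L_true)

# Render a specular-only image with a narrow Blinn-Phong-style lobe + noise.
half = L_true + view
half /= np.linalg.norm(half)
spec = np.clip(normals @ half, 0.0, 1.0) ** 200 * mask
spec += rng.normal(scale=0.01, size=spec.shape)

# Mirror constraint: at the brightest pixel the normal bisects view and
# light, so reflecting the view direction about it recovers the light.
i, j = np.unravel_index(spec.argmax(), spec.shape)
n = normals[i, j]
L_est = 2.0 * (n @ view) * n - view
print(np.round(L_est, 3), np.round(L_true, 3))
```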
Simultaneous image classification and annotation
2009 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2009-06-20 DOI: 10.1109/CVPR.2009.5206800
Chong Wang, D. Blei, Li Fei-Fei
{"title":"Simultaneous image classification and annotation","authors":"Chong Wang, D. Blei, Li Fei-Fei","doi":"10.1109/CVPR.2009.5206800","DOIUrl":"https://doi.org/10.1109/CVPR.2009.5206800","url":null,"abstract":"Image classification and annotation are important problems in computer vision, but rarely considered together. Intuitively, annotations provide evidence for the class label, and the class label provides evidence for annotations. For example, an image of class highway is more likely annotated with words “road,” “car,” and “traffic” than words “fish,” “boat,” and “scuba.” In this paper, we develop a new probabilistic model for jointly modeling the image, its class label, and its annotations. Our model treats the class label as a global description of the image, and treats annotation terms as local descriptions of parts of the image. Its underlying probabilistic assumptions naturally integrate these two sources of information. We derive an approximate inference and estimation algorithms based on variational methods, as well as efficient approximations for classifying and annotating new images. We examine the performance of our model on two real-world image data sets, illustrating that a single model provides competitive annotation performance, and superior classification performance.","PeriodicalId":386532,"journal":{"name":"2009 IEEE Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2009-06-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115540802","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 612
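The abstract's highway example, annotations as evidence for the class label, can be made concrete with a naive Bayes fusion over annotation words. The two classes, six-word vocabulary, and probability tables below are invented for illustration; the paper's actual model is a joint supervised topic model fit with variational inference, not this simple fusion.

```python
import numpy as np

classes = ["highway", "coast"]
vocab = ["road", "car", "traffic", "fish", "boat", "scuba"]

# Hypothetical learned tables: p(word | class) rows sum to 1, plus p(class).
p_word = np.array([[0.30, 0.25, 0.25, 0.05, 0.10, 0.05],   # highway
                   [0.05, 0.05, 0.05, 0.30, 0.35, 0.20]])  # coast
prior = np.array([0.5, 0.5])

def classify(words):
    """Posterior over classes from annotation words alone (naive Bayes)."""
    idx = [vocab.index(w) for w in words]
    logp = np.log(prior) + np.log(p_word[:, idx]).sum(axis=1)
    post = np.exp(logp - logp.max())
    post /= post.sum()
    return dict(zip(classes, np.round(post, 3)))

print(classify(["road", "car", "traffic"]))   # strongly 'highway'
print(classify(["boat", "fish"]))             # strongly 'coast'
```

In the paper the influence also runs the other way: the inferred class label constrains which topics, and hence which annotation words, are likely for the image.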