2013 IEEE Conference on Computer Vision and Pattern Recognition: Latest Publications

Optimized Product Quantization for Approximate Nearest Neighbor Search
2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.379
T. Ge, Kaiming He, Qifa Ke, Jian Sun
{"title":"Optimized Product Quantization for Approximate Nearest Neighbor Search","authors":"T. Ge, Kaiming He, Qifa Ke, Jian Sun","doi":"10.1109/CVPR.2013.379","DOIUrl":"https://doi.org/10.1109/CVPR.2013.379","url":null,"abstract":"Product quantization is an effective vector quantization approach to compactly encode high-dimensional vectors for fast approximate nearest neighbor (ANN) search. The essence of product quantization is to decompose the original high-dimensional space into the Cartesian product of a finite number of low-dimensional subspaces that are then quantized separately. Optimal space decomposition is important for the performance of ANN search, but still remains unaddressed. In this paper, we optimize product quantization by minimizing quantization distortions w.r.t. the space decomposition and the quantization codebooks. We present two novel methods for optimization: a non-parametric method that alternatively solves two smaller sub-problems, and a parametric method that is guaranteed to achieve the optimal solution if the input data follows some Gaussian distribution. We show by experiments that our optimized approach substantially improves the accuracy of product quantization for ANN search.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"11 1","pages":"2946-2953"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90436059","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 367
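For readers unfamiliar with the underlying technique, the sketch below illustrates plain product quantization (training one k-means codebook per subspace and encoding each vector as one codeword index per subspace). It is not the optimized variant proposed in the paper; the function names, subspace count, and codebook size are illustrative assumptions, and it relies on numpy and scikit-learn.

```python
import numpy as np
from sklearn.cluster import KMeans

def train_pq(X, num_subspaces=4, codebook_size=256):
    """Plain product quantization: split the dimensions into subspaces
    and run k-means independently in each one."""
    d = X.shape[1]
    sub_dim = d // num_subspaces
    codebooks = []
    for m in range(num_subspaces):
        sub = X[:, m * sub_dim:(m + 1) * sub_dim]
        km = KMeans(n_clusters=codebook_size, n_init=4).fit(sub)
        codebooks.append(km.cluster_centers_)
    return codebooks

def encode_pq(X, codebooks):
    """Encode each vector as one codeword index per subspace."""
    codes = []
    for m, C in enumerate(codebooks):
        sub_dim = C.shape[1]
        sub = X[:, m * sub_dim:(m + 1) * sub_dim]
        dists = ((sub[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        codes.append(dists.argmin(1))
    return np.stack(codes, axis=1)  # shape: (n, num_subspaces)
```

The paper's contribution is, in effect, choosing how the dimensions are split (via a learned rotation) so that the per-subspace quantizers above incur minimal total distortion.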
Submodular Salient Region Detection
2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.266
Zhuolin Jiang, L. Davis
{"title":"Submodular Salient Region Detection","authors":"Zhuolin Jiang, L. Davis","doi":"10.1109/CVPR.2013.266","DOIUrl":"https://doi.org/10.1109/CVPR.2013.266","url":null,"abstract":"The problem of salient region detection is formulated as the well-studied facility location problem from operations research. High-level priors are combined with low-level features to detect salient regions. Salient region detection is achieved by maximizing a sub modular objective function, which maximizes the total similarities (i.e., total profits) between the hypothesized salient region centers (i.e., facility locations) and their region elements (i.e., clients), and penalizes the number of potential salient regions (i.e., the number of open facilities). The similarities are efficiently computed by finding a closed-form harmonic solution on the constructed graph for an input image. The saliency of a selected region is modeled in terms of appearance and spatial location. By exploiting the sub modularity properties of the objective function, a highly efficient greedy-based optimization algorithm can be employed. This algorithm is guaranteed to be at least a (e - 1)/e 0.632-approximation to the optimum. Experimental results demonstrate that our approach outperforms several recently proposed saliency detection approaches.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"38 1","pages":"2043-2050"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86068549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 154
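As a rough illustration of the greedy facility-location step mentioned in the abstract, the sketch below greedily opens "facilities" (candidate region centers) while the marginal gain in covered similarity exceeds an opening cost. The objective, variable names, and stopping rule are simplified assumptions rather than the paper's exact formulation; the familiar (1 - 1/e) guarantee applies to greedy maximization of monotone submodular functions under a cardinality constraint.

```python
import numpy as np

def greedy_facility_location(sim, open_cost, max_centers):
    """sim[i, j]: similarity between element j and candidate center i.
    Greedily open centers while the marginal gain exceeds the opening cost."""
    n_candidates, n_elements = sim.shape
    chosen = []
    covered = np.zeros(n_elements)           # best similarity achieved so far
    for _ in range(max_centers):
        # marginal gain of opening each remaining candidate
        gains = np.maximum(sim, covered).sum(axis=1) - covered.sum()
        gains[chosen] = -np.inf               # never reopen a chosen center
        best = int(gains.argmax())
        if gains[best] <= open_cost:          # penalty for opening a facility
            break
        chosen.append(best)
        covered = np.maximum(covered, sim[best])
    return chosen
```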
Complex Event Detection via Multi-source Video Attributes
2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.339
Zhigang Ma, Yi Yang, Zhongwen Xu, Shuicheng Yan, N. Sebe, Alexander Hauptmann
{"title":"Complex Event Detection via Multi-source Video Attributes","authors":"Zhigang Ma, Yi Yang, Zhongwen Xu, Shuicheng Yan, N. Sebe, Alexander Hauptmann","doi":"10.1109/CVPR.2013.339","DOIUrl":"https://doi.org/10.1109/CVPR.2013.339","url":null,"abstract":"Complex events essentially include human, scenes, objects and actions that can be summarized by visual attributes, so leveraging relevant attributes properly could be helpful for event detection. Many works have exploited attributes at image level for various applications. However, attributes at image level are possibly insufficient for complex event detection in videos due to their limited capability in characterizing the dynamic properties of video data. Hence, we propose to leverage attributes at video level (named as video attributes in this work), i.e., the semantic labels of external videos are used as attributes. Compared to complex event videos, these external videos contain simple contents such as objects, scenes and actions which are the basic elements of complex events. Specifically, building upon a correlation vector which correlates the attributes and the complex event, we incorporate video attributes latently as extra informative cues into the event detector learnt from complex event videos. Extensive experiments on a real-world large-scale dataset validate the efficacy of the proposed approach.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"27 1","pages":"2627-2633"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81301777","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 73
Detection Evolution with Multi-order Contextual Co-occurrence
2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.235
Guang Chen, Yuanyuan Ding, Jing Xiao, T. Han
{"title":"Detection Evolution with Multi-order Contextual Co-occurrence","authors":"Guang Chen, Yuanyuan Ding, Jing Xiao, T. Han","doi":"10.1109/CVPR.2013.235","DOIUrl":"https://doi.org/10.1109/CVPR.2013.235","url":null,"abstract":"Context has been playing an increasingly important role to improve the object detection performance. In this paper we propose an effective representation, Multi-Order Contextual co-Occurrence (MOCO), to implicitly model the high level context using solely detection responses from a baseline object detector. The so-called (1st-order) context feature is computed as a set of randomized binary comparisons on the response map of the baseline object detector. The statistics of the 1st-order binary context features are further calculated to construct a high order co-occurrence descriptor. Combining the MOCO feature with the original image feature, we can evolve the baseline object detector to a stronger context aware detector. With the updated detector, we can continue the evolution till the contextual improvements saturate. Using the successful deformable-part-model detector [13] as the baseline detector, we test the proposed MOCO evolution framework on the PASCAL VOC 2007 dataset [8] and Caltech pedestrian dataset [7]: The proposed MOCO detector outperforms all known state-of-the-art approaches, contextually boosting deformable part models (ver. 5) [13] by 3.3% in mean average precision on the PASCAL 2007 dataset. For the Caltech pedestrian dataset, our method further reduces the log-average miss rate from 48% to 46% and the miss rate at 1 FPPI from 25% to 23%, compared with the best prior art [6].","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"4 1","pages":"1798-1805"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81398816","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 93
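The 1st-order context feature described in the abstract, randomized binary comparisons over a detector's response map, can be sketched in a few lines. The sampling scheme, pair count, and function name below are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

def first_order_context_feature(response_map, num_pairs=256, seed=0):
    """Binary context descriptor: compare detector responses at randomly
    chosen pairs of locations, one bit per comparison."""
    rng = np.random.default_rng(seed)
    h, w = response_map.shape
    ys = rng.integers(0, h, size=(num_pairs, 2))
    xs = rng.integers(0, w, size=(num_pairs, 2))
    a = response_map[ys[:, 0], xs[:, 0]]
    b = response_map[ys[:, 1], xs[:, 1]]
    return (a > b).astype(np.uint8)
```

Higher-order MOCO descriptors would then be built from co-occurrence statistics of these bits, which is beyond this sketch.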
Supervised Semantic Gradient Extraction Using Linear-Time Optimization
2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.364
Shulin Yang, Jue Wang, L. Shapiro
{"title":"Supervised Semantic Gradient Extraction Using Linear-Time Optimization","authors":"Shulin Yang, Jue Wang, L. Shapiro","doi":"10.1109/CVPR.2013.364","DOIUrl":"https://doi.org/10.1109/CVPR.2013.364","url":null,"abstract":"This paper proposes a new supervised semantic edge and gradient extraction approach, which allows the user to roughly scribble over the desired region to extract semantically-dominant and coherent edges in it. Our approach first extracts low-level edge lets (small edge clusters) from the input image as primitives and build a graph upon them, by jointly considering both the geometric and appearance compatibility of edge lets. Given the characteristics of the graph, it cannot be effectively optimized by commonly-used energy minimization tools such as graph cuts. We thus propose an efficient linear algorithm for precise graph optimization, by taking advantage of the special structure of the graph. %Optimal parameter settings of the model are learnt from a dataset. Objective evaluations show that the proposed method significantly outperforms previous semantic edge detection algorithms. Finally, we demonstrate the effectiveness of the system in various image editing tasks.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"45 1","pages":"2826-2833"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81441017","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
Learning Binary Codes for High-Dimensional Data Using Bilinear Projections
2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.69
Yunchao Gong, Sanjiv Kumar, H. Rowley, S. Lazebnik
{"title":"Learning Binary Codes for High-Dimensional Data Using Bilinear Projections","authors":"Yunchao Gong, Sanjiv Kumar, H. Rowley, S. Lazebnik","doi":"10.1109/CVPR.2013.69","DOIUrl":"https://doi.org/10.1109/CVPR.2013.69","url":null,"abstract":"Recent advances in visual recognition indicate that to achieve good retrieval and classification accuracy on large-scale datasets like Image Net, extremely high-dimensional visual descriptors, e.g., Fisher Vectors, are needed. We present a novel method for converting such descriptors to compact similarity-preserving binary codes that exploits their natural matrix structure to reduce their dimensionality using compact bilinear projections instead of a single large projection matrix. This method achieves comparable retrieval and classification accuracy to the original descriptors and to the state-of-the-art Product Quantization approach while having orders of magnitude faster code generation time and smaller memory footprint.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"36 1","pages":"484-491"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87282451","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 185
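The bilinear-projection idea, reshaping a long descriptor into a matrix and applying two small rotations instead of one huge projection before taking signs, can be illustrated as follows. The projections here are random orthogonal matrices purely for illustration (the paper learns them), and the dimensions are arbitrary assumptions.

```python
import numpy as np

def bilinear_binary_code(x, R1, R2):
    """Binarize a d1*d2-dimensional descriptor by reshaping it into a
    d1 x d2 matrix and applying two small rotations: code = sign(R1^T X R2)."""
    d1, d2 = R1.shape[0], R2.shape[0]
    X = x.reshape(d1, d2)
    B = np.sign(R1.T @ X @ R2)
    return (B.ravel() > 0).astype(np.uint8)

# Example with random orthogonal projections (illustrative only).
rng = np.random.default_rng(0)
d1, d2 = 64, 128                       # 8192-dim descriptor viewed as 64x128
R1, _ = np.linalg.qr(rng.standard_normal((d1, d1)))
R2, _ = np.linalg.qr(rng.standard_normal((d2, d2)))
code = bilinear_binary_code(rng.standard_normal(d1 * d2), R1, R2)
```

The storage saving is the point: two matrices of size d1 x d1 and d2 x d2 replace a single (d1*d2) x (d1*d2) projection.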
Robust Region Grouping via Internal Patch Statistics
2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.252
Xiaobai Liu, Liang Lin, A. Yuille
{"title":"Robust Region Grouping via Internal Patch Statistics","authors":"Xiaobai Liu, Liang Lin, A. Yuille","doi":"10.1109/CVPR.2013.252","DOIUrl":"https://doi.org/10.1109/CVPR.2013.252","url":null,"abstract":"In this work, we present an efficient multi-scale low-rank representation for image segmentation. Our method begins with partitioning the input images into a set of super pixels, followed by seeking the optimal super pixel-pair affinity matrix, both of which are performed at multiple scales of the input images. Since low-level super pixel features are usually corrupted by image noises, we propose to infer the low-rank refined affinity matrix. The inference is guided by two observations on natural images. First, looking into a single image, local small-size image patterns tend to recur frequently within the same semantic region, but may not appear in semantically different regions. We call this internal image statistics as replication prior, and quantitatively justify it on real image databases. Second, the affinity matrices at different scales should be consistently solved, which leads to the cross-scale consistency constraint. We formulate these two purposes with one unified formulation and develop an efficient optimization procedure. Our experiments demonstrate the presented method can substantially improve segmentation accuracy.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"48 1","pages":"1931-1938"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84304672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 13
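The "low-rank refined affinity matrix" can be conveyed by the standard singular-value soft-thresholding operator that underlies nuclear-norm models. The one-step sketch below is only meant to illustrate that idea; it omits the paper's multi-scale formulation and cross-scale consistency constraint, and the threshold tau is an assumed parameter.

```python
import numpy as np

def low_rank_refine(affinity, tau):
    """Refine a noisy superpixel-pair affinity matrix by singular-value
    soft-thresholding (the proximal operator of the nuclear norm)."""
    A = (affinity + affinity.T) / 2.0            # keep the matrix symmetric
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    s = np.maximum(s - tau, 0.0)                 # shrink small singular values
    return U @ np.diag(s) @ Vt
```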
Fine-Grained Crowdsourcing for Fine-Grained Recognition
2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.81
Jia Deng, J. Krause, Li Fei-Fei
{"title":"Fine-Grained Crowdsourcing for Fine-Grained Recognition","authors":"Jia Deng, J. Krause, Li Fei-Fei","doi":"10.1109/CVPR.2013.81","DOIUrl":"https://doi.org/10.1109/CVPR.2013.81","url":null,"abstract":"Fine-grained recognition concerns categorization at sub-ordinate levels, where the distinction between object classes is highly local. Compared to basic level recognition, fine-grained categorization can be more challenging as there are in general less data and fewer discriminative features. This necessitates the use of stronger prior for feature selection. In this work, we include humans in the loop to help computers select discriminative features. We introduce a novel online game called \"Bubbles\" that reveals discriminative features humans use. The player's goal is to identify the category of a heavily blurred image. During the game, the player can choose to reveal full details of circular regions (\"bubbles\"), with a certain penalty. With proper setup the game generates discriminative bubbles with assured quality. We next propose the \"Bubble Bank\" algorithm that uses the human selected bubbles to improve machine recognition performance. Experiments demonstrate that our approach yields large improvements over the previous state of the art on challenging benchmarks.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"112 1","pages":"580-587"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88309775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 300
Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context
2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.405
Gautam Singh, J. Kosecka
{"title":"Nonparametric Scene Parsing with Adaptive Feature Relevance and Semantic Context","authors":"Gautam Singh, J. Kosecka","doi":"10.1109/CVPR.2013.405","DOIUrl":"https://doi.org/10.1109/CVPR.2013.405","url":null,"abstract":"This paper presents a nonparametric approach to semantic parsing using small patches and simple gradient, color and location features. We learn the relevance of individual feature channels at test time using a locally adaptive distance metric. To further improve the accuracy of the nonparametric approach, we examine the importance of the retrieval set used to compute the nearest neighbours using a novel semantic descriptor to retrieve better candidates. The approach is validated by experiments on several datasets used for semantic parsing demonstrating the superiority of the method compared to the state of art approaches.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"27 1","pages":"3151-3157"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88231380","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 97
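A minimal sketch of nearest-neighbour label transfer with per-channel relevance weights, which is the flavour of locally adaptive distance metric described above. How the weights are learnt at test time is not shown, and all names are illustrative assumptions.

```python
import numpy as np

def weighted_nn_label(query_feat, train_feats, train_labels, channel_weights):
    """Transfer the label of the nearest training patch under a weighted
    squared-Euclidean distance, with one relevance weight per feature channel."""
    diff = train_feats - query_feat                 # (n, d) broadcast over rows
    dists = (channel_weights * diff ** 2).sum(axis=1)
    return train_labels[dists.argmin()]
```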
Adherent Raindrop Detection and Removal in Video
2013 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2013-06-23 DOI: 10.1109/CVPR.2013.138
Shaodi You, R. Tan, Rei Kawakami, K. Ikeuchi
{"title":"Adherent Raindrop Detection and Removal in Video","authors":"Shaodi You, R. Tan, Rei Kawakami, K. Ikeuchi","doi":"10.1109/CVPR.2013.138","DOIUrl":"https://doi.org/10.1109/CVPR.2013.138","url":null,"abstract":"Raindrops adhered to a windscreen or window glass can significantly degrade the visibility of a scene. Detecting and removing raindrops will, therefore, benefit many computer vision applications, particularly outdoor surveillance systems and intelligent vehicle systems. In this paper, a method that automatically detects and removes adherent raindrops is introduced. The core idea is to exploit the local spatio-temporal derivatives of raindrops. First, it detects raindrops based on the motion and the intensity temporal derivatives of the input video. Second, relying on an analysis that some areas of a raindrop completely occludes the scene, yet the remaining areas occludes only partially, the method removes the two types of areas separately. For partially occluding areas, it restores them by retrieving as much as possible information of the scene, namely, by solving a blending function on the detected partially occluding areas using the temporal intensity change. For completely occluding areas, it recovers them by using a video completion technique. Experimental results using various real videos show the effectiveness of the proposed method.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"20 1","pages":"1035-1042"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85602857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 66
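As a crude illustration of the temporal-derivative cue mentioned in the abstract, the sketch below flags pixels whose intensity changes little across a grayscale clip, since adherent drops tend to change much more slowly than the scene behind them. It is not the paper's detection method, and the threshold is an assumed parameter.

```python
import numpy as np

def slow_change_mask(frames, threshold):
    """Flag pixels whose mean absolute temporal intensity derivative stays
    small over the clip; a rough stand-in for the temporal cue only."""
    video = np.stack(frames).astype(np.float32)      # (t, h, w) grayscale
    dt = np.abs(np.diff(video, axis=0)).mean(axis=0)
    return dt < threshold
```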