2012 IEEE Conference on Computer Vision and Pattern Recognition: Latest Publications

Discovering important people and objects for egocentric video summarization
2012 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2012-06-16 DOI: 10.1109/CVPR.2012.6247820
Yong Jae Lee, Joydeep Ghosh, K. Grauman
{"title":"Discovering important people and objects for egocentric video summarization","authors":"Yong Jae Lee, Joydeep Ghosh, K. Grauman","doi":"10.1109/CVPR.2012.6247820","DOIUrl":"https://doi.org/10.1109/CVPR.2012.6247820","url":null,"abstract":"We developed an approach to summarize egocentric video. We introduced novel egocentric features to train a regressor that predicts important regions. Using the discovered important regions, our approach produces significantly more informative summaries than traditional methods that often include irrelevant or redundant information.","PeriodicalId":177454,"journal":{"name":"2012 IEEE Conference on Computer Vision and Pattern Recognition","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126857707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 709
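The pipeline the abstract describes (score candidate regions with a learned importance regressor, keep the frames whose regions score highest) can be made concrete with a minimal sketch. This is not the authors' implementation: the linear weight vector `w` and the per-frame region features are assumed stand-ins for their learned regressor and egocentric features.

```python
import numpy as np

def summarize(frame_region_feats, w, num_keyframes=5):
    """Keyframe selection sketch: a frame's importance is the score of its
    highest-scoring region under a (hypothetical) linear regressor w.
    frame_region_feats: list of (n_regions_i, d) arrays, one per frame."""
    frame_scores = np.array([float((feats @ w).max())
                             for feats in frame_region_feats])
    # Keep the top-scoring frames, returned in temporal order.
    return np.sort(np.argsort(frame_scores)[-num_keyframes:])
```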
Computer vision aided target linked radiation imaging
2012 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2012-06-16 DOI: 10.1109/CVPR.2012.6247797
Dashan Gao, Yi Yao, Fengjian Pan, Ting Yu, Li Guan, Walt Dixon, B. Yanoff, Tai-Peng Tian, N. Krahnstoever
{"title":"Computer vision aided target linked radiation imaging","authors":"Dashan Gao, Yi Yao, Fengjian Pan, Ting Yu, Ting Yu, Li Guan, Walt Dixon, B. Yanoff, Tai-Peng Tian, N. Krahnstoever","doi":"10.1109/CVPR.2012.6247797","DOIUrl":"https://doi.org/10.1109/CVPR.2012.6247797","url":null,"abstract":"In this paper, we demonstrated an application of video tracking to radiation detection, where a vision-based tracking system enables a traditional CZT (cadmium zinc telluride)-based radiation imaging device to detect radioactive targets that are in motion. An integrated real-time system consisting of multiple fixed cameras and radiation detectors was implemented and tested. The multi-camera tracking system combines multiple feature cues (such as silhouette, appearance, and geometry) from different viewing angles to ensure consistent target identities under challenging tracking conditions. Experimental results show that both the video tracking and the integrated systems perform accurately and persistently under various scenarios involving multiple vehicles, driving speeds, and driving patterns. The results also validate and reiterate the importance of video tracking as an enabling technology in the field of radiation imaging.","PeriodicalId":177454,"journal":{"name":"2012 IEEE Conference on Computer Vision and Pattern Recognition","volume":"136 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116032396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
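As a rough illustration of the multi-cue data association the abstract mentions (not the paper's actual tracker), the sketch below fuses per-cue cost matrices with fixed weights and greedily assigns detections to tracks; the `weights` and the greedy scheme are assumptions for the example.

```python
import numpy as np

def associate(cue_costs, weights):
    """Greedy track-detection association over a weighted sum of cue costs.
    cue_costs: list of (n_tracks, n_dets) cost matrices, e.g. one each for
    silhouette, appearance, and geometry cues."""
    C = sum(w * c for w, c in zip(weights, cue_costs))
    pairs, used_t, used_d = [], set(), set()
    for flat in np.argsort(C, axis=None):          # cheapest pairs first
        t, d = np.unravel_index(flat, C.shape)
        if t not in used_t and d not in used_d:
            pairs.append((int(t), int(d)))
            used_t.add(t); used_d.add(d)
    return pairs
```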
Efficient object detection using cascades of nearest convex model classifiers
2012 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2012-06-16 DOI: 10.1109/CVPR.2012.6248047
Hakan Cevikalp, B. Triggs
{"title":"Efficient object detection using cascades of nearest convex model classifiers","authors":"Hakan Cevikalp, B. Triggs","doi":"10.1109/CVPR.2012.6248047","DOIUrl":"https://doi.org/10.1109/CVPR.2012.6248047","url":null,"abstract":"An object detector must detect and localize each instance of the object class of interest in the image. Many recent detectors adopt a sliding window approach, reducing the problem to one of deciding whether the detection window currently contains a valid object instance or background. Machine learning based discriminants such as SVM and boosting are typically used for this, often in the form of classifier cascades to allow more rapid rejection of easy negatives. We argue that “one class” methods - ones that focus mainly on modelling the range of the positive class - are a useful alternative to binary discriminants in such applications, particularly in the early stages of the cascade where one-class approaches may allow simpler classifiers and faster rejection. We implement this in the form of a short cascade of efficient nearest-convex-model one-class classifiers, starting with linear distance-to-affine-hyperplane and interior-of-hypersphere classifiers and finishing with kernelized hypersphere classifiers. We show that our methods have very competitive performance on the Faces in the Wild and ESOGU face detection datasets and state-of-the-art performance on the INRIA Person dataset. As predicted, the one-class formulations provide significant reductions in classifier complexity relative to the corresponding two-class ones.","PeriodicalId":177454,"journal":{"name":"2012 IEEE Conference on Computer Vision and Pattern Recognition","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116057224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 60
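A minimal sketch of one "interior of hypersphere" one-class stage and an early-rejection cascade over such stages. The `HypersphereStage` class and the `quantile` radius rule are assumptions for illustration; the paper's actual cascade also uses distance-to-affine-hyperplane and kernelized variants.

```python
import numpy as np

class HypersphereStage:
    """One-class stage: accept a window if its feature vector lies inside
    a hypersphere fitted around positive (object) training features."""
    def fit(self, X_pos, quantile=0.95):
        self.c = X_pos.mean(axis=0)
        radii = np.linalg.norm(X_pos - self.c, axis=1)
        self.r = np.quantile(radii, quantile)   # tolerate a few outliers
        return self

    def accept(self, x):
        return np.linalg.norm(x - self.c) <= self.r

def cascade_accepts(x, stages):
    # Reject as soon as any stage fails; cheap stages go first, so most
    # easy negatives never reach the expensive kernelized stages.
    return all(stage.accept(x) for stage in stages)
```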
A study on human age estimation under facial expression changes
2012 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2012-06-16 DOI: 10.1109/CVPR.2012.6247972
G. Guo, Xiaolong Wang
{"title":"A study on human age estimation under facial expression changes","authors":"G. Guo, Xiaolong Wang","doi":"10.1109/CVPR.2012.6247972","DOIUrl":"https://doi.org/10.1109/CVPR.2012.6247972","url":null,"abstract":"In this paper, we study human age estimation in face images under significant expression changes. We will address two issues: (1) Is age estimation affected by facial expression changes and how significant is the influence? (2) How to develop a robust method to perform age estimation undergoing various facial expression changes? This systematic study will not only discover the relation between age estimation and expression changes, but also contribute a robust solution to solve the problem of cross-expression age estimation. This study is an important step towards developing a practical and robust age estimation system that allows users to present their faces naturally (with various expressions) rather than constrained to the neutral expression only. Two databases originally captured in the Psychology community are introduced to Computer Vision, to quantitatively demonstrate the influence of expression changes on age estimation, and evaluate the proposed framework and corresponding methods for cross-expression age estimation.","PeriodicalId":177454,"journal":{"name":"2012 IEEE Conference on Computer Vision and Pattern Recognition","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116518625","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 97
Large-scale image classification with trace-norm regularization
2012 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2012-06-16 DOI: 10.1109/CVPR.2012.6248078
Zaïd Harchaoui, Matthijs Douze, Mattis Paulin, Miroslav Dudík, J. Malick
{"title":"Large-scale image classification with trace-norm regularization","authors":"Zaïd Harchaoui, Matthijs Douze, Mattis Paulin, Miroslav Dudík, J. Malick","doi":"10.1109/CVPR.2012.6248078","DOIUrl":"https://doi.org/10.1109/CVPR.2012.6248078","url":null,"abstract":"With the advent of larger image classification datasets such as ImageNet, designing scalable and efficient multi-class classification algorithms is now an important challenge. We introduce a new scalable learning algorithm for large-scale multi-class image classification, based on the multinomial logistic loss and the trace-norm regularization penalty. Reframing the challenging non-smooth optimization problem into a surrogate infinite-dimensional optimization problem with a regular ℓ1-regularization penalty, we propose a simple and provably efficient accelerated coordinate descent algorithm. Furthermore, we show how to perform efficient matrix computations in the compressed domain for quantized dense visual features, scaling up to 100,000s examples, 1,000s-dimensional features, and 100s of categories. Promising experimental results on the \"Fungus\", \"Ungulate\", and \"Vehicles\" subsets of ImageNet are presented, where we show that our approach performs significantly better than state-of-the-art approaches for Fisher vectors with 16 Gaussians.","PeriodicalId":177454,"journal":{"name":"2012 IEEE Conference on Computer Vision and Pattern Recognition","volume":"374 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116520716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 113
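The key building block of trace-norm regularized learning is the proximal operator of the trace norm, i.e. soft-thresholding of singular values. The sketch below shows it inside a plain proximal-gradient step on the multinomial logistic loss; note the paper itself uses a faster accelerated coordinate-descent reformulation, so this is only a sketch of the objective, and the step size `lr` is an assumed hyperparameter.

```python
import numpy as np

def prox_trace_norm(W, tau):
    """prox of tau * ||W||_*: soft-threshold the singular values of W."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def prox_grad_step(W, X, Y, lam, lr):
    """One proximal-gradient step on multinomial logistic loss + lam*||W||_*.
    X: (n, d) features, Y: (n, k) one-hot labels, W: (d, k) weights."""
    Z = X @ W
    Z -= Z.max(axis=1, keepdims=True)             # numerically stable softmax
    P = np.exp(Z); P /= P.sum(axis=1, keepdims=True)
    grad = X.T @ (P - Y) / X.shape[0]
    return prox_trace_norm(W - lr * grad, lr * lam)
```

Because soft-thresholding zeroes out small singular values, iterating such steps drives W toward low rank, which is what makes the regularizer attractive for many-category classification.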
Semi-coupled dictionary learning with applications to image super-resolution and photo-sketch synthesis
2012 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2012-06-16 DOI: 10.1109/CVPR.2012.6247930
Shenlong Wang, Lei Zhang, Yan Liang, Q. Pan
{"title":"Semi-coupled dictionary learning with applications to image super-resolution and photo-sketch synthesis","authors":"Shenlong Wang, Lei Zhang, Yan Liang, Q. Pan","doi":"10.1109/CVPR.2012.6247930","DOIUrl":"https://doi.org/10.1109/CVPR.2012.6247930","url":null,"abstract":"In various computer vision applications, often we need to convert an image in one style into another style for better visualization, interpretation and recognition; for examples, up-convert a low resolution image to a high resolution one, and convert a face sketch into a photo for matching, etc. A semi-coupled dictionary learning (SCDL) model is proposed in this paper to solve such cross-style image synthesis problems. Under SCDL, a pair of dictionaries and a mapping function will be simultaneously learned. The dictionary pair can well characterize the structural domains of the two styles of images, while the mapping function can reveal the intrinsic relationship between the two styles' domains. In SCDL, the two dictionaries will not be fully coupled, and hence much flexibility can be given to the mapping function for an accurate conversion across styles. Moreover, clustering and image nonlocal redundancy are introduced to enhance the robustness of SCDL. The proposed SCDL model is applied to image super-resolution and photo-sketch synthesis, and the experimental results validated its generality and effectiveness in cross-style image synthesis.","PeriodicalId":177454,"journal":{"name":"2012 IEEE Conference on Computer Vision and Pattern Recognition","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122975378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 567
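A toy sketch of the two SCDL ingredients the abstract names: updating the mapping W between the two styles' coding coefficients, and synthesizing across styles at test time. Ridge coding stands in for the paper's sparse coding, and `gamma` is an assumed regularization constant, so this is a simplified illustration rather than the authors' algorithm.

```python
import numpy as np

def update_mapping(Ax, Ay, gamma=0.1):
    """Ridge-regularized least-squares update of W in  Ay ≈ W @ Ax.
    Ax, Ay: (k, n) coefficient matrices (codes are columns)."""
    k = Ax.shape[0]
    return Ay @ Ax.T @ np.linalg.inv(Ax @ Ax.T + gamma * np.eye(k))

def synthesize(x, Dx, Dy, W, gamma=0.1):
    """Cross-style synthesis: code x over Dx, map the code with W, decode
    with Dy. Ridge coding replaces the paper's sparse coding for brevity."""
    k = Dx.shape[1]
    ax = np.linalg.solve(Dx.T @ Dx + gamma * np.eye(k), Dx.T @ x)
    return Dy @ (W @ ax)
```

The "semi-coupled" idea shows up in exactly this separation: the dictionaries fit each domain independently well, and W alone carries the cross-domain relationship.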
Sum-product networks for modeling activities with stochastic structure
2012 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2012-06-16 DOI: 10.1109/CVPR.2012.6247816
Mohamed R. Amer, S. Todorovic
{"title":"Sum-product networks for modeling activities with stochastic structure","authors":"Mohamed R. Amer, S. Todorovic","doi":"10.1109/CVPR.2012.6247816","DOIUrl":"https://doi.org/10.1109/CVPR.2012.6247816","url":null,"abstract":"This paper addresses recognition of human activities with stochastic structure, characterized by variable spacetime arrangements of primitive actions, and conducted by a variable number of actors. We demonstrate that modeling aggregate counts of visual words is surprisingly expressive enough for such a challenging recognition task. An activity is represented by a sum-product network (SPN). SPN is a mixture of bags-of-words (BoWs) with exponentially many mixture components, where subcomponents are reused by larger ones. SPN consists of terminal nodes representing BoWs, and product and sum nodes organized in a number of layers. The products are aimed at encoding particular configurations of primitive actions, and the sums serve to capture their alternative configurations. The connectivity of SPN and parameters of BoW distributions are learned under weak supervision using the EM algorithm. SPN inference amounts to parsing the SPN graph, which yields the most probable explanation (MPE) of the video in terms of activity detection and localization. SPN inference has linear complexity in the number of nodes, under fairly general conditions, enabling fast and scalable recognition. A new Volleyball dataset is compiled and annotated for evaluation. Our classification accuracy and localization precision and recall are superior to those of the state-of-the-art on the benchmark and our Volleyball datasets.","PeriodicalId":177454,"journal":{"name":"2012 IEEE Conference on Computer Vision and Pattern Recognition","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114454818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
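Evaluating an SPN bottom-up reduces to additions at product nodes and log-sum-exp at sum nodes when done in log space. A minimal recursive sketch, with a toy tuple-based node encoding assumed here (the paper's learned structure and BoW likelihoods are not reproduced):

```python
import numpy as np

def log_eval(node, bow_loglik):
    """Evaluate an SPN in log space.
    node: ('leaf', i) | ('prod', children) | ('sum', children, log_weights)
    bow_loglik[i]: log-likelihood of the video's word counts under BoW i."""
    if node[0] == 'leaf':
        return bow_loglik[node[1]]
    if node[0] == 'prod':                 # product node: logs add
        return sum(log_eval(c, bow_loglik) for c in node[1])
    children, log_w = node[1], np.asarray(node[2])
    vals = np.array([log_eval(c, bow_loglik) for c in children]) + log_w
    m = vals.max()                        # sum node: log-sum-exp of children
    return m + np.log(np.exp(vals - m).sum())
```

One pass over the graph visits each node once, which is the linear-time property the abstract points to.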
Fan Shape Model for object detection
2012 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2012-06-16 DOI: 10.1109/CVPR.2012.6247670
Xinggang Wang, X. Bai, Tianyang Ma, Wenyu Liu, Longin Jan Latecki
{"title":"Fan Shape Model for object detection","authors":"Xinggang Wang, X. Bai, Tianyang Ma, Wenyu Liu, Longin Jan Latecki","doi":"10.1109/CVPR.2012.6247670","DOIUrl":"https://doi.org/10.1109/CVPR.2012.6247670","url":null,"abstract":"We propose a novel shape model for object detection called Fan Shape Model (FSM). We model contour sample points as rays of final length emanating for a reference point. As in folding fan, its slats, which we call rays, are very flexible. This flexibility allows FSM to tolerate large shape variance. However, the order and the adjacency relation of the slats stay invariant during fan deformation, since the slats are connected with a thin fabric. In analogy, we enforce the order and adjacency relation of the rays to stay invariant during the deformation. Therefore, FSM preserves discriminative power while allowing for a substantial shape deformation. FSM allows also for precise scale estimation during object detection. Thus, there is not need to scale the shape model or image in order to perform object detection. Another advantage of FSM is the fact that it can be applied directly to edge images, since it does not require any linking of edge pixels to edge fragments (contours).","PeriodicalId":177454,"journal":{"name":"2012 IEEE Conference on Computer Vision and Pattern Recognition","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114572628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 42
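A crude numpy illustration (not the paper's matcher) of the two properties the abstract highlights: rays keep their angular order around the reference point, and per-ray length ratios yield a direct scale estimate. Both helper functions and the alignment-by-angle assumption are ours, for illustration only.

```python
import numpy as np

def rays(points, ref):
    """Represent contour sample points as rays about a reference point,
    sorted by angle so the rays' order is preserved."""
    v = points - ref
    ang = np.arctan2(v[:, 1], v[:, 0])
    order = np.argsort(ang)
    return ang[order], np.linalg.norm(v, axis=1)[order]

def scale_and_residual(model_len, obs_len):
    """For rays already aligned by angular order, the length ratios give a
    scale estimate; the residual acts as a crude deformation cost."""
    scale = np.median(obs_len / model_len)
    return scale, np.abs(obs_len - scale * model_len).mean()
```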
Saliency-guided integration of multiple scans
2012 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2012-06-16 DOI: 10.1109/CVPR.2012.6247836
R. Song, Yonghuai Liu, Ralph Robert Martin, Paul L. Rosin
{"title":"Saliency-guided integration of multiple scans","authors":"R. Song, Yonghuai Liu, Ralph Robert Martin, Paul L. Rosin","doi":"10.1109/CVPR.2012.6247836","DOIUrl":"https://doi.org/10.1109/CVPR.2012.6247836","url":null,"abstract":"We present a novel method to integrate multiple 3D scans captured from different viewpoints. Saliency information is used to guide the integration process. The multi-scale saliency of a point is specifically designed to reflect its sensitivity to registration errors. Then scans are partitioned into salient and non-salient regions through an Markov Random Field (MRF) framework where neighbourhood consistency is incorporated to increase the robustness against potential scanning errors. We then develop different schemes to discriminatively integrate points in the two regions. For the points in salient regions which are more sensitive to registration errors, we employ the Iterative Closest Point algorithm to compensate the local registration error and find the correspondences for the integration. For the points in non-salient regions which are less sensitive to registration errors, we integrate them via an efficient and effective point-shifting scheme. A comparative study shows that the proposed method delivers improved surface integration.","PeriodicalId":177454,"journal":{"name":"2012 IEEE Conference on Computer Vision and Pattern Recognition","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129552375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
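For the salient regions the method relies on ICP. A minimal single-iteration sketch, assuming brute-force nearest neighbours and the standard Kabsch rigid-alignment step (purely illustrative; the paper's saliency weighting and point-shifting scheme are not shown):

```python
import numpy as np

def icp_step(src, dst):
    """One ICP iteration: pair each source point with its nearest destination
    point, then solve the optimal rigid transform (Kabsch) for those pairs.
    src, dst: (n, 3) and (m, 3) point arrays."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    nn = dst[d2.argmin(axis=1)]           # nearest destination per source point
    mu_s, mu_d = src.mean(0), nn.mean(0)
    H = (src - mu_s).T @ (nn - mu_d)      # cross-covariance of centred sets
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:              # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_d - R @ mu_s
    return src @ R.T + t                  # source points after alignment
```

Repeating this step until the residual stops shrinking is the usual ICP loop; the paper applies it only where saliency says registration error matters.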
Discovering discriminative action parts from mid-level video representations
2012 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2012-06-16 DOI: 10.1109/CVPR.2012.6247807
Michalis Raptis, Iasonas Kokkinos, Stefano Soatto
{"title":"Discovering discriminative action parts from mid-level video representations","authors":"Michalis Raptis, Iasonas Kokkinos, Stefano Soatto","doi":"10.1109/CVPR.2012.6247807","DOIUrl":"https://doi.org/10.1109/CVPR.2012.6247807","url":null,"abstract":"We describe a mid-level approach for action recognition. From an input video, we extract salient spatio-temporal structures by forming clusters of trajectories that serve as candidates for the parts of an action. The assembly of these clusters into an action class is governed by a graphical model that incorporates appearance and motion constraints for the individual parts and pairwise constraints for the spatio-temporal dependencies among them. During training, we estimate the model parameters discriminatively. During classification, we efficiently match the model to a video using discrete optimization. We validate the model's classification ability in standard benchmark datasets and illustrate its potential to support a fine-grained analysis that not only gives a label to a video, but also identifies and localizes its constituent parts.","PeriodicalId":177454,"journal":{"name":"2012 IEEE Conference on Computer Vision and Pattern Recognition","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128208153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 251
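The graphical model the abstract describes scores an assignment of action parts to trajectory clusters with unary (appearance/motion) terms and pairwise (spatio-temporal) terms; inference searches for the best-scoring assignment. A tiny scoring sketch with assumed data structures, not the paper's learned potentials:

```python
def assignment_score(assign, unary, pairwise, edges):
    """assign[p]: cluster chosen for part p; unary[p][c]: fit of cluster c
    to part p; pairwise[(p, q)][c][c2]: consistency of a cluster pair for
    model edge (p, q). Discrete optimization would maximize this score."""
    score = sum(unary[p][c] for p, c in enumerate(assign))
    score += sum(pairwise[(p, q)][assign[p]][assign[q]] for p, q in edges)
    return score
```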