2012 IEEE Conference on Computer Vision and Pattern Recognition: Latest Publications

Discovering important people and objects for egocentric video summarization
2012 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2012-06-16 DOI: 10.1109/CVPR.2012.6247820
Yong Jae Lee, Joydeep Ghosh, K. Grauman
{"title":"Discovering important people and objects for egocentric video summarization","authors":"Yong Jae Lee, Joydeep Ghosh, K. Grauman","doi":"10.1109/CVPR.2012.6247820","DOIUrl":"https://doi.org/10.1109/CVPR.2012.6247820","url":null,"abstract":"We developed an approach to summarize egocentric video. We introduced novel egocentric features to train a regressor that predicts important regions. Using the discovered important regions, our approach produces significantly more informative summaries than traditional methods that often include irrelevant or redundant information.","PeriodicalId":177454,"journal":{"name":"2012 IEEE Conference on Computer Vision and Pattern Recognition","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126857707","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 709
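The pipeline the abstract describes (score candidate regions with a learned importance regressor, keep the frames whose regions score highest) can be made concrete with a minimal sketch. This is not the authors' implementation: the linear weight vector `w` and the per-frame region features are assumed stand-ins for their learned regressor and egocentric features.

```python
import numpy as np

def summarize(frame_region_feats, w, num_keyframes=5):
    """Keyframe selection sketch: a frame's importance is the score of its
    highest-scoring region under a (hypothetical) linear regressor w.
    frame_region_feats: list of (n_regions_i, d) arrays, one per frame."""
    frame_scores = np.array([float((feats @ w).max())
                             for feats in frame_region_feats])
    # Keep the top-scoring frames, returned in temporal order.
    return np.sort(np.argsort(frame_scores)[-num_keyframes:])
```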
Computer vision aided target linked radiation imaging
2012 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2012-06-16 DOI: 10.1109/CVPR.2012.6247797
Dashan Gao, Yi Yao, Fengjian Pan, Ting Yu, Li Guan, Walt Dixon, B. Yanoff, Tai-Peng Tian, N. Krahnstoever
{"title":"Computer vision aided target linked radiation imaging","authors":"Dashan Gao, Yi Yao, Fengjian Pan, Ting Yu, Ting Yu, Li Guan, Walt Dixon, B. Yanoff, Tai-Peng Tian, N. Krahnstoever","doi":"10.1109/CVPR.2012.6247797","DOIUrl":"https://doi.org/10.1109/CVPR.2012.6247797","url":null,"abstract":"In this paper, we demonstrated an application of video tracking to radiation detection, where a vision-based tracking system enables a traditional CZT (cadmium zinc telluride)-based radiation imaging device to detect radioactive targets that are in motion. An integrated real-time system consisting of multiple fixed cameras and radiation detectors was implemented and tested. The multi-camera tracking system combines multiple feature cues (such as silhouette, appearance, and geometry) from different viewing angles to ensure consistent target identities under challenging tracking conditions. Experimental results show that both the video tracking and the integrated systems perform accurately and persistently under various scenarios involving multiple vehicles, driving speeds, and driving patterns. The results also validate and reiterate the importance of video tracking as an enabling technology in the field of radiation imaging.","PeriodicalId":177454,"journal":{"name":"2012 IEEE Conference on Computer Vision and Pattern Recognition","volume":"136 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116032396","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
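As a rough illustration of the multi-cue data association the abstract mentions (not the paper's actual tracker), the sketch below fuses per-cue cost matrices with fixed weights and greedily assigns detections to tracks; the `weights` and the greedy scheme are assumptions for the example.

```python
import numpy as np

def associate(cue_costs, weights):
    """Greedy track-detection association over a weighted sum of cue costs.
    cue_costs: list of (n_tracks, n_dets) cost matrices, e.g. one each for
    silhouette, appearance, and geometry cues."""
    C = sum(w * c for w, c in zip(weights, cue_costs))
    pairs, used_t, used_d = [], set(), set()
    for flat in np.argsort(C, axis=None):          # cheapest pairs first
        t, d = np.unravel_index(flat, C.shape)
        if t not in used_t and d not in used_d:
            pairs.append((int(t), int(d)))
            used_t.add(t); used_d.add(d)
    return pairs
```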
Efficient object detection using cascades of nearest convex model classifiers
2012 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2012-06-16 DOI: 10.1109/CVPR.2012.6248047
Hakan Cevikalp, B. Triggs
{"title":"Efficient object detection using cascades of nearest convex model classifiers","authors":"Hakan Cevikalp, B. Triggs","doi":"10.1109/CVPR.2012.6248047","DOIUrl":"https://doi.org/10.1109/CVPR.2012.6248047","url":null,"abstract":"An object detector must detect and localize each instance of the object class of interest in the image. Many recent detectors adopt a sliding window approach, reducing the problem to one of deciding whether the detection window currently contains a valid object instance or background. Machine learning based discriminants such as SVM and boosting are typically used for this, often in the form of classifier cascades to allow more rapid rejection of easy negatives. We argue that “one class” methods - ones that focus mainly on modelling the range of the positive class - are a useful alternative to binary discriminants in such applications, particularly in the early stages of the cascade where one-class approaches may allow simpler classifiers and faster rejection. We implement this in the form of a short cascade of efficient nearest-convex-model one-class classifiers, starting with linear distance-to-affine-hyperplane and interior-of-hypersphere classifiers and finishing with kernelized hypersphere classifiers. We show that our methods have very competitive performance on the Faces in the Wild and ESOGU face detection datasets and state-of-the-art performance on the INRIA Person dataset. As predicted, the one-class formulations provide significant reductions in classifier complexity relative to the corresponding two-class ones.","PeriodicalId":177454,"journal":{"name":"2012 IEEE Conference on Computer Vision and Pattern Recognition","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116057224","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 60
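A minimal sketch of one "interior of hypersphere" one-class stage and an early-rejection cascade over such stages. The `HypersphereStage` class and the `quantile` radius rule are assumptions for illustration; the paper's actual cascade also uses distance-to-affine-hyperplane and kernelized variants.

```python
import numpy as np

class HypersphereStage:
    """One-class stage: accept a window if its feature vector lies inside
    a hypersphere fitted around positive (object) training features."""
    def fit(self, X_pos, quantile=0.95):
        self.c = X_pos.mean(axis=0)
        radii = np.linalg.norm(X_pos - self.c, axis=1)
        self.r = np.quantile(radii, quantile)   # tolerate a few outliers
        return self

    def accept(self, x):
        return np.linalg.norm(x - self.c) <= self.r

def cascade_accepts(x, stages):
    # Reject as soon as any stage fails; cheap stages go first, so most
    # easy negatives never reach the expensive kernelized stages.
    return all(stage.accept(x) for stage in stages)
```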
A study on human age estimation under facial expression changes
2012 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2012-06-16 DOI: 10.1109/CVPR.2012.6247972
G. Guo, Xiaolong Wang
{"title":"A study on human age estimation under facial expression changes","authors":"G. Guo, Xiaolong Wang","doi":"10.1109/CVPR.2012.6247972","DOIUrl":"https://doi.org/10.1109/CVPR.2012.6247972","url":null,"abstract":"In this paper, we study human age estimation in face images under significant expression changes. We will address two issues: (1) Is age estimation affected by facial expression changes and how significant is the influence? (2) How to develop a robust method to perform age estimation undergoing various facial expression changes? This systematic study will not only discover the relation between age estimation and expression changes, but also contribute a robust solution to solve the problem of cross-expression age estimation. This study is an important step towards developing a practical and robust age estimation system that allows users to present their faces naturally (with various expressions) rather than constrained to the neutral expression only. Two databases originally captured in the Psychology community are introduced to Computer Vision, to quantitatively demonstrate the influence of expression changes on age estimation, and evaluate the proposed framework and corresponding methods for cross-expression age estimation.","PeriodicalId":177454,"journal":{"name":"2012 IEEE Conference on Computer Vision and Pattern Recognition","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116518625","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 97
Large-scale image classification with trace-norm regularization
2012 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2012-06-16 DOI: 10.1109/CVPR.2012.6248078
Zaïd Harchaoui, Matthijs Douze, Mattis Paulin, Miroslav Dudík, J. Malick
{"title":"Large-scale image classification with trace-norm regularization","authors":"Zaïd Harchaoui, Matthijs Douze, Mattis Paulin, Miroslav Dudík, J. Malick","doi":"10.1109/CVPR.2012.6248078","DOIUrl":"https://doi.org/10.1109/CVPR.2012.6248078","url":null,"abstract":"With the advent of larger image classification datasets such as ImageNet, designing scalable and efficient multi-class classification algorithms is now an important challenge. We introduce a new scalable learning algorithm for large-scale multi-class image classification, based on the multinomial logistic loss and the trace-norm regularization penalty. Reframing the challenging non-smooth optimization problem into a surrogate infinite-dimensional optimization problem with a regular ℓ1-regularization penalty, we propose a simple and provably efficient accelerated coordinate descent algorithm. Furthermore, we show how to perform efficient matrix computations in the compressed domain for quantized dense visual features, scaling up to 100,000s examples, 1,000s-dimensional features, and 100s of categories. Promising experimental results on the \"Fungus\", \"Ungulate\", and \"Vehicles\" subsets of ImageNet are presented, where we show that our approach performs significantly better than state-of-the-art approaches for Fisher vectors with 16 Gaussians.","PeriodicalId":177454,"journal":{"name":"2012 IEEE Conference on Computer Vision and Pattern Recognition","volume":"374 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116520716","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 113
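The key building block of trace-norm regularized learning is the proximal operator of the trace norm, i.e. soft-thresholding of singular values. The sketch below shows it inside a plain proximal-gradient step on the multinomial logistic loss; note the paper itself uses a faster accelerated coordinate-descent reformulation, so this is only a sketch of the objective, and the step size `lr` is an assumed hyperparameter.

```python
import numpy as np

def prox_trace_norm(W, tau):
    """prox of tau * ||W||_*: soft-threshold the singular values of W."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def prox_grad_step(W, X, Y, lam, lr):
    """One proximal-gradient step on multinomial logistic loss + lam*||W||_*.
    X: (n, d) features, Y: (n, k) one-hot labels, W: (d, k) weights."""
    Z = X @ W
    Z -= Z.max(axis=1, keepdims=True)             # numerically stable softmax
    P = np.exp(Z); P /= P.sum(axis=1, keepdims=True)
    grad = X.T @ (P - Y) / X.shape[0]
    return prox_trace_norm(W - lr * grad, lr * lam)
```

Because soft-thresholding zeroes out small singular values, iterating such steps drives W toward low rank, which is what makes the regularizer attractive for many-category classification.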
Semi-coupled dictionary learning with applications to image super-resolution and photo-sketch synthesis
2012 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2012-06-16 DOI: 10.1109/CVPR.2012.6247930
Shenlong Wang, Lei Zhang, Yan Liang, Q. Pan
{"title":"Semi-coupled dictionary learning with applications to image super-resolution and photo-sketch synthesis","authors":"Shenlong Wang, Lei Zhang, Yan Liang, Q. Pan","doi":"10.1109/CVPR.2012.6247930","DOIUrl":"https://doi.org/10.1109/CVPR.2012.6247930","url":null,"abstract":"In various computer vision applications, often we need to convert an image in one style into another style for better visualization, interpretation and recognition; for examples, up-convert a low resolution image to a high resolution one, and convert a face sketch into a photo for matching, etc. A semi-coupled dictionary learning (SCDL) model is proposed in this paper to solve such cross-style image synthesis problems. Under SCDL, a pair of dictionaries and a mapping function will be simultaneously learned. The dictionary pair can well characterize the structural domains of the two styles of images, while the mapping function can reveal the intrinsic relationship between the two styles' domains. In SCDL, the two dictionaries will not be fully coupled, and hence much flexibility can be given to the mapping function for an accurate conversion across styles. Moreover, clustering and image nonlocal redundancy are introduced to enhance the robustness of SCDL. The proposed SCDL model is applied to image super-resolution and photo-sketch synthesis, and the experimental results validated its generality and effectiveness in cross-style image synthesis.","PeriodicalId":177454,"journal":{"name":"2012 IEEE Conference on Computer Vision and Pattern Recognition","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122975378","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 567
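A toy sketch of the two SCDL ingredients the abstract names: updating the mapping W between the two styles' coding coefficients, and synthesizing across styles at test time. Ridge coding stands in for the paper's sparse coding, and `gamma` is an assumed regularization constant, so this is a simplified illustration rather than the authors' algorithm.

```python
import numpy as np

def update_mapping(Ax, Ay, gamma=0.1):
    """Ridge-regularized least-squares update of W in  Ay ≈ W @ Ax.
    Ax, Ay: (k, n) coefficient matrices (codes are columns)."""
    k = Ax.shape[0]
    return Ay @ Ax.T @ np.linalg.inv(Ax @ Ax.T + gamma * np.eye(k))

def synthesize(x, Dx, Dy, W, gamma=0.1):
    """Cross-style synthesis: code x over Dx, map the code with W, decode
    with Dy. Ridge coding replaces the paper's sparse coding for brevity."""
    k = Dx.shape[1]
    ax = np.linalg.solve(Dx.T @ Dx + gamma * np.eye(k), Dx.T @ x)
    return Dy @ (W @ ax)
```

The "semi-coupled" idea shows up in exactly this separation: the dictionaries fit each domain independently well, and W alone carries the cross-domain relationship.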
Sum-product networks for modeling activities with stochastic structure
2012 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2012-06-16 DOI: 10.1109/CVPR.2012.6247816
Mohamed R. Amer, S. Todorovic
{"title":"Sum-product networks for modeling activities with stochastic structure","authors":"Mohamed R. Amer, S. Todorovic","doi":"10.1109/CVPR.2012.6247816","DOIUrl":"https://doi.org/10.1109/CVPR.2012.6247816","url":null,"abstract":"This paper addresses recognition of human activities with stochastic structure, characterized by variable spacetime arrangements of primitive actions, and conducted by a variable number of actors. We demonstrate that modeling aggregate counts of visual words is surprisingly expressive enough for such a challenging recognition task. An activity is represented by a sum-product network (SPN). SPN is a mixture of bags-of-words (BoWs) with exponentially many mixture components, where subcomponents are reused by larger ones. SPN consists of terminal nodes representing BoWs, and product and sum nodes organized in a number of layers. The products are aimed at encoding particular configurations of primitive actions, and the sums serve to capture their alternative configurations. The connectivity of SPN and parameters of BoW distributions are learned under weak supervision using the EM algorithm. SPN inference amounts to parsing the SPN graph, which yields the most probable explanation (MPE) of the video in terms of activity detection and localization. SPN inference has linear complexity in the number of nodes, under fairly general conditions, enabling fast and scalable recognition. A new Volleyball dataset is compiled and annotated for evaluation. Our classification accuracy and localization precision and recall are superior to those of the state-of-the-art on the benchmark and our Volleyball datasets.","PeriodicalId":177454,"journal":{"name":"2012 IEEE Conference on Computer Vision and Pattern Recognition","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114454818","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
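Evaluating an SPN bottom-up reduces to additions at product nodes and log-sum-exp at sum nodes when done in log space. A minimal recursive sketch, with a toy tuple-based node encoding assumed here (the paper's learned structure and BoW likelihoods are not reproduced):

```python
import numpy as np

def log_eval(node, bow_loglik):
    """Evaluate an SPN in log space.
    node: ('leaf', i) | ('prod', children) | ('sum', children, log_weights)
    bow_loglik[i]: log-likelihood of the video's word counts under BoW i."""
    if node[0] == 'leaf':
        return bow_loglik[node[1]]
    if node[0] == 'prod':                 # product node: logs add
        return sum(log_eval(c, bow_loglik) for c in node[1])
    children, log_w = node[1], np.asarray(node[2])
    vals = np.array([log_eval(c, bow_loglik) for c in children]) + log_w
    m = vals.max()                        # sum node: log-sum-exp of children
    return m + np.log(np.exp(vals - m).sum())
```

One pass over the graph visits each node once, which is the linear-time property the abstract points to.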
Fan Shape Model for object detection
2012 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2012-06-16 DOI: 10.1109/CVPR.2012.6247670
Xinggang Wang, X. Bai, Tianyang Ma, Wenyu Liu, Longin Jan Latecki
{"title":"Fan Shape Model for object detection","authors":"Xinggang Wang, X. Bai, Tianyang Ma, Wenyu Liu, Longin Jan Latecki","doi":"10.1109/CVPR.2012.6247670","DOIUrl":"https://doi.org/10.1109/CVPR.2012.6247670","url":null,"abstract":"We propose a novel shape model for object detection called Fan Shape Model (FSM). We model contour sample points as rays of final length emanating for a reference point. As in folding fan, its slats, which we call rays, are very flexible. This flexibility allows FSM to tolerate large shape variance. However, the order and the adjacency relation of the slats stay invariant during fan deformation, since the slats are connected with a thin fabric. In analogy, we enforce the order and adjacency relation of the rays to stay invariant during the deformation. Therefore, FSM preserves discriminative power while allowing for a substantial shape deformation. FSM allows also for precise scale estimation during object detection. Thus, there is not need to scale the shape model or image in order to perform object detection. Another advantage of FSM is the fact that it can be applied directly to edge images, since it does not require any linking of edge pixels to edge fragments (contours).","PeriodicalId":177454,"journal":{"name":"2012 IEEE Conference on Computer Vision and Pattern Recognition","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114572628","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 42
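A crude numpy illustration (not the paper's matcher) of the two properties the abstract highlights: rays keep their angular order around the reference point, and per-ray length ratios yield a direct scale estimate. Both helper functions and the alignment-by-angle assumption are ours, for illustration only.

```python
import numpy as np

def rays(points, ref):
    """Represent contour sample points as rays about a reference point,
    sorted by angle so the rays' order is preserved."""
    v = points - ref
    ang = np.arctan2(v[:, 1], v[:, 0])
    order = np.argsort(ang)
    return ang[order], np.linalg.norm(v, axis=1)[order]

def scale_and_residual(model_len, obs_len):
    """For rays already aligned by angular order, the length ratios give a
    scale estimate; the residual acts as a crude deformation cost."""
    scale = np.median(obs_len / model_len)
    return scale, np.abs(obs_len - scale * model_len).mean()
```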
Saliency-guided integration of multiple scans
2012 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2012-06-16 DOI: 10.1109/CVPR.2012.6247836
R. Song, Yonghuai Liu, Ralph Robert Martin, Paul L. Rosin
{"title":"Saliency-guided integration of multiple scans","authors":"R. Song, Yonghuai Liu, Ralph Robert Martin, Paul L. Rosin","doi":"10.1109/CVPR.2012.6247836","DOIUrl":"https://doi.org/10.1109/CVPR.2012.6247836","url":null,"abstract":"We present a novel method to integrate multiple 3D scans captured from different viewpoints. Saliency information is used to guide the integration process. The multi-scale saliency of a point is specifically designed to reflect its sensitivity to registration errors. Then scans are partitioned into salient and non-salient regions through an Markov Random Field (MRF) framework where neighbourhood consistency is incorporated to increase the robustness against potential scanning errors. We then develop different schemes to discriminatively integrate points in the two regions. For the points in salient regions which are more sensitive to registration errors, we employ the Iterative Closest Point algorithm to compensate the local registration error and find the correspondences for the integration. For the points in non-salient regions which are less sensitive to registration errors, we integrate them via an efficient and effective point-shifting scheme. A comparative study shows that the proposed method delivers improved surface integration.","PeriodicalId":177454,"journal":{"name":"2012 IEEE Conference on Computer Vision and Pattern Recognition","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129552375","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
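For the salient regions the method relies on ICP. A minimal single-iteration sketch, assuming brute-force nearest neighbours and the standard Kabsch rigid-alignment step (purely illustrative; the paper's saliency weighting and point-shifting scheme are not shown):

```python
import numpy as np

def icp_step(src, dst):
    """One ICP iteration: pair each source point with its nearest destination
    point, then solve the optimal rigid transform (Kabsch) for those pairs.
    src, dst: (n, 3) and (m, 3) point arrays."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    nn = dst[d2.argmin(axis=1)]           # nearest destination per source point
    mu_s, mu_d = src.mean(0), nn.mean(0)
    H = (src - mu_s).T @ (nn - mu_d)      # cross-covariance of centred sets
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:              # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_d - R @ mu_s
    return src @ R.T + t                  # source points after alignment
```

Repeating this step until the residual stops shrinking is the usual ICP loop; the paper applies it only where saliency says registration error matters.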
Discovering discriminative action parts from mid-level video representations
2012 IEEE Conference on Computer Vision and Pattern Recognition Pub Date: 2012-06-16 DOI: 10.1109/CVPR.2012.6247807
Michalis Raptis, Iasonas Kokkinos, Stefano Soatto
{"title":"Discovering discriminative action parts from mid-level video representations","authors":"Michalis Raptis, Iasonas Kokkinos, Stefano Soatto","doi":"10.1109/CVPR.2012.6247807","DOIUrl":"https://doi.org/10.1109/CVPR.2012.6247807","url":null,"abstract":"We describe a mid-level approach for action recognition. From an input video, we extract salient spatio-temporal structures by forming clusters of trajectories that serve as candidates for the parts of an action. The assembly of these clusters into an action class is governed by a graphical model that incorporates appearance and motion constraints for the individual parts and pairwise constraints for the spatio-temporal dependencies among them. During training, we estimate the model parameters discriminatively. During classification, we efficiently match the model to a video using discrete optimization. We validate the model's classification ability in standard benchmark datasets and illustrate its potential to support a fine-grained analysis that not only gives a label to a video, but also identifies and localizes its constituent parts.","PeriodicalId":177454,"journal":{"name":"2012 IEEE Conference on Computer Vision and Pattern Recognition","volume":"51 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-06-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128208153","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 251
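The graphical model the abstract describes scores an assignment of action parts to trajectory clusters with unary (appearance/motion) terms and pairwise (spatio-temporal) terms; inference searches for the best-scoring assignment. A tiny scoring sketch with assumed data structures, not the paper's learned potentials:

```python
def assignment_score(assign, unary, pairwise, edges):
    """assign[p]: cluster chosen for part p; unary[p][c]: fit of cluster c
    to part p; pairwise[(p, q)][c][c2]: consistency of a cluster pair for
    model edge (p, q). Discrete optimization would maximize this score."""
    score = sum(unary[p][c] for p, c in enumerate(assign))
    score += sum(pairwise[(p, q)][assign[p]][assign[q]] for p, q in edges)
    return score
```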