2014 IEEE Conference on Computer Vision and Pattern Recognition: Latest Publications

Tracklet Association with Online Target-Specific Metric Learning
2014 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.161
B. Wang, G. Wang, K. Chan, Li Wang
{"title":"Tracklet Association with Online Target-Specific Metric Learning","authors":"B. Wang, G. Wang, K. Chan, Li Wang","doi":"10.1109/CVPR.2014.161","DOIUrl":"https://doi.org/10.1109/CVPR.2014.161","url":null,"abstract":"This paper presents a novel introduction of online target-specific metric learning in track fragment (tracklet) association by network flow optimization for long-term multi-person tracking. Different from other network flow formulation, each node in our network represents a tracklet, and each edge represents the likelihood of neighboring tracklets belonging to the same trajectory as measured by our proposed affinity score. In our method, target-specific similarity metrics are learned, which give rise to the appearance-based models used in the tracklet affinity estimation. Trajectory-based tracklets are refined by using the learned metrics to account for appearance consistency and to identify reliable tracklets. The metrics are then re-learned using reliable tracklets for computing tracklet affinity scores. Long-term trajectories are then obtained through network flow optimization. Occlusions and missed detections are handled by a trajectory completion step. Our method is effective for long-term tracking even when the targets are spatially close or completely occluded by others. We validate our proposed framework on several public datasets and show that it outperforms several state of art methods.","PeriodicalId":319578,"journal":{"name":"2014 IEEE Conference on Computer Vision and Pattern Recognition","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125849618","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 110
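To make the tracklet-association idea concrete, the following minimal sketch links tracklets by appearance affinity under a metric M with a temporal gate, using a Hungarian bipartite assignment as a simplified stand-in for the paper's network-flow optimization; the tracklet format, affinity function, and thresholds are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: greedy tracklet linking via bipartite assignment.
# Simplified stand-in for the paper's network-flow formulation; the affinity
# combines an appearance distance under a (placeholder) learned metric M with
# a temporal gate. All constants are illustrative assumptions.
import numpy as np
from scipy.optimize import linear_sum_assignment

def affinity(trk_a, trk_b, M):
    """Likelihood that trk_b continues trk_a (appearance + temporal gap)."""
    if trk_b["start"] <= trk_a["end"]:          # must not overlap in time
        return 0.0
    d = trk_a["feat"] - trk_b["feat"]
    appearance = np.exp(-d @ M @ d)             # Mahalanobis-style metric
    temporal = np.exp(-0.1 * (trk_b["start"] - trk_a["end"]))
    return appearance * temporal

def link_tracklets(tracklets, M, min_affinity=0.1):
    n = len(tracklets)
    cost = np.full((n, n), 1e6)
    for i in range(n):
        for j in range(n):
            if i != j:
                a = affinity(tracklets[i], tracklets[j], M)
                if a > min_affinity:
                    cost[i, j] = -np.log(a)     # higher affinity -> lower cost
    rows, cols = linear_sum_assignment(cost)
    # keep only links whose cost corresponds to an admissible affinity
    return [(i, j) for i, j in zip(rows, cols) if cost[i, j] < 1e6]

# toy usage: two tracklets of the same person, one of another person
tracklets = [
    {"start": 0,  "end": 10, "feat": np.array([1.0, 0.0])},
    {"start": 12, "end": 20, "feat": np.array([0.9, 0.1])},
    {"start": 12, "end": 20, "feat": np.array([0.0, 1.0])},
]
M = np.eye(2)   # identity metric as a placeholder for the learned target-specific metric
print(link_tracklets(tracklets, M))             # expected: [(0, 1)]
```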
Video Classification Using Semantic Concept Co-occurrences
2014 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.324
Shayan Modiri Assari, A. Zamir, M. Shah
{"title":"Video Classification Using Semantic Concept Co-occurrences","authors":"Shayan Modiri Assari, A. Zamir, M. Shah","doi":"10.1109/CVPR.2014.324","DOIUrl":"https://doi.org/10.1109/CVPR.2014.324","url":null,"abstract":"We address the problem of classifying complex videos based on their content. A typical approach to this problem is performing the classification using semantic attributes, commonly termed concepts, which occur in the video. In this paper, we propose a contextual approach to video classification based on Generalized Maximum Clique Problem (GMCP) which uses the co-occurrence of concepts as the context model. To be more specific, we propose to represent a class based on the co-occurrence of its concepts and classify a video based on matching its semantic co-occurrence pattern to each class representation. We perform the matching using GMCP which finds the strongest clique of co-occurring concepts in a video. We argue that, in principal, the co-occurrence of concepts yields a richer representation of a video compared to most of the current approaches. Additionally, we propose a novel optimal solution to GMCP based on Mixed Binary Integer Programming (MBIP). The evaluations show our approach, which opens new opportunities for further research in this direction, outperforms several well established video classification methods.","PeriodicalId":319578,"journal":{"name":"2014 IEEE Conference on Computer Vision and Pattern Recognition","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124645144","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 38
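A rough sketch of the co-occurrence representation described above: each class is summarized by how often its concepts co-occur in training videos, and a test video is scored against each class by the overlap of co-occurrence patterns. The GMCP/MBIP matching from the paper is not reproduced; the toy concepts and the scoring rule are assumptions for illustration only.

```python
# Simplified sketch of the co-occurrence idea: a class model is a normalized
# concept co-occurrence matrix, and a test video is scored by how well its own
# co-occurrence pattern matches each class model.
import itertools
import numpy as np

def cooccurrence_matrix(videos, n_concepts):
    """videos: list of sets of concept indices detected in each video."""
    C = np.zeros((n_concepts, n_concepts))
    for concepts in videos:
        for i, j in itertools.combinations(sorted(concepts), 2):
            C[i, j] += 1
            C[j, i] += 1
    total = C.sum()
    return C / total if total > 0 else C

def classify(video_concepts, class_models, n_concepts):
    """Return the class whose co-occurrence pattern best matches the video."""
    V = cooccurrence_matrix([video_concepts], n_concepts)
    scores = {c: float((V * M).sum()) for c, M in class_models.items()}
    return max(scores, key=scores.get), scores

# toy usage with 4 concepts: "birthday" videos tend to co-occur {0: cake, 1: singing}
train = {
    "birthday": [{0, 1}, {0, 1, 2}],
    "parade":   [{2, 3}, {1, 2, 3}],
}
models = {c: cooccurrence_matrix(v, 4) for c, v in train.items()}
print(classify({0, 1, 3}, models, 4))            # expected winner: "birthday"
```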
Who Do I Look Like? Determining Parent-Offspring Resemblance via Gated Autoencoders
2014 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.227
Afshin Dehghan, E. Ortiz, Ruben Villegas, M. Shah
{"title":"Who Do I Look Like? Determining Parent-Offspring Resemblance via Gated Autoencoders","authors":"Afshin Dehghan, E. Ortiz, Ruben Villegas, M. Shah","doi":"10.1109/CVPR.2014.227","DOIUrl":"https://doi.org/10.1109/CVPR.2014.227","url":null,"abstract":"Recent years have seen a major push for face recognition technology due to the large expansion of image sharing on social networks. In this paper, we consider the difficult task of determining parent-offspring resemblance using deep learning to answer the question \"Who do I look like?\" Although humans can perform this job at a rate higher than chance, it is not clear how they do it [2]. However, recent studies in anthropology [24] have determined which features tend to be the most discriminative. In this study, we aim to not only create an accurate system for resemblance detection, but bridge the gap between studies in anthropology with computer vision techniques. Further, we aim to answer two key questions: 1) Do offspring resemble their parents? and 2) Do offspring resemble one parent more than the other? We propose an algorithm that fuses the features and metrics discovered via gated autoencoders with a discriminative neural network layer that learns the optimal, or what we call genetic, features to delineate parent-offspring relationships. We further analyze the correlation between our automatically detected features and those found in anthropological studies. Meanwhile, our method outperforms the state-of-the-art in kinship verification by 3-10% depending on the relationship using specific (father-son, mother-daughter, etc.) and generic models.","PeriodicalId":319578,"journal":{"name":"2014 IEEE Conference on Computer Vision and Pattern Recognition","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125048884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 111
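As a minimal illustration of the gated-autoencoder building block mentioned in the abstract, the sketch below runs a forward pass of a factored gated autoencoder, where mapping units encode the multiplicative relation between a parent feature vector and a child feature vector. Dimensions, random initialization, and the reconstruction rule are assumptions, not the authors' trained model.

```python
# Forward pass of a factored gated autoencoder: mapping units capture the
# relation between two inputs x (parent) and y (child) through multiplicative
# interactions of their factor projections.
import numpy as np

rng = np.random.default_rng(0)
d, f, m = 64, 32, 16            # input dim, factor dim, mapping-unit dim

Wx = rng.normal(scale=0.1, size=(f, d))   # factors for the parent vector x
Wy = rng.normal(scale=0.1, size=(f, d))   # factors for the child vector y
Wm = rng.normal(scale=0.1, size=(m, f))   # mapping (relation) units

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def infer_mapping(x, y):
    """Mapping units encode the relation (e.g., resemblance) between x and y."""
    return sigmoid(Wm @ ((Wx @ x) * (Wy @ y)))

def reconstruct_y(x, mapping):
    """Reconstruct the child vector from the parent vector and the relation code."""
    return Wy.T @ ((Wm.T @ mapping) * (Wx @ x))

x = rng.normal(size=d)          # toy parent feature vector
y = rng.normal(size=d)          # toy child feature vector
h = infer_mapping(x, y)
y_hat = reconstruct_y(x, h)
print("reconstruction error:", float(np.mean((y - y_hat) ** 2)))
```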
Facial Expression Recognition via a Boosted Deep Belief Network
2014 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.233
Ping Liu, Shizhong Han, Zibo Meng, Yan Tong
{"title":"Facial Expression Recognition via a Boosted Deep Belief Network","authors":"Ping Liu, Shizhong Han, Zibo Meng, Yan Tong","doi":"10.1109/CVPR.2014.233","DOIUrl":"https://doi.org/10.1109/CVPR.2014.233","url":null,"abstract":"A training process for facial expression recognition is usually performed sequentially in three individual stages: feature learning, feature selection, and classifier construction. Extensive empirical studies are needed to search for an optimal combination of feature representation, feature set, and classifier to achieve good recognition performance. This paper presents a novel Boosted Deep Belief Network (BDBN) for performing the three training stages iteratively in a unified loopy framework. Through the proposed BDBN framework, a set of features, which is effective to characterize expression-related facial appearance/shape changes, can be learned and selected to form a boosted strong classifier in a statistical way. As learning continues, the strong classifier is improved iteratively and more importantly, the discriminative capabilities of selected features are strengthened as well according to their relative importance to the strong classifier via a joint fine-tune process in the BDBN framework. Extensive experiments on two public databases showed that the BDBN framework yielded dramatic improvements in facial expression analysis.","PeriodicalId":319578,"journal":{"name":"2014 IEEE Conference on Computer Vision and Pattern Recognition","volume":"64 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128448575","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 557
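To illustrate the boosting stage in isolation, here is a small AdaBoost loop that forms a strong classifier from weak learners in the statistical way the abstract describes. In the paper the weak learners come from DBN-learned features and are jointly fine-tuned; this sketch substitutes decision stumps on toy data.

```python
# Discrete AdaBoost: re-weight hard examples each round and combine weak
# learners (decision stumps here) into a weighted strong classifier.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost(X, y, n_rounds=10):
    """y must be in {-1, +1}. Returns (weak_learners, alphas)."""
    n = len(y)
    w = np.full(n, 1.0 / n)
    learners, alphas = [], []
    for _ in range(n_rounds):
        stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=w)
        pred = stump.predict(X)
        err = np.clip(np.sum(w * (pred != y)), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        w *= np.exp(-alpha * y * pred)          # up-weight misclassified samples
        w /= w.sum()
        learners.append(stump)
        alphas.append(alpha)
    return learners, alphas

def predict(learners, alphas, X):
    scores = sum(a * l.predict(X) for l, a in zip(learners, alphas))
    return np.sign(scores)

# toy usage: two noisy "expression features"
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = np.where(X[:, 0] + 0.5 * X[:, 1] > 0, 1, -1)
learners, alphas = adaboost(X, y)
print("training accuracy:", float((predict(learners, alphas, X) == y).mean()))
```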
Temporal Segmentation of Egocentric Videos
2014 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.325
Y. Poleg, Chetan Arora, Shmuel Peleg
{"title":"Temporal Segmentation of Egocentric Videos","authors":"Y. Poleg, Chetan Arora, Shmuel Peleg","doi":"10.1109/CVPR.2014.325","DOIUrl":"https://doi.org/10.1109/CVPR.2014.325","url":null,"abstract":"The use of wearable cameras makes it possible to record life logging egocentric videos. Browsing such long unstructured videos is time consuming and tedious. Segmentation into meaningful chapters is an important first step towards adding structure to egocentric videos, enabling efficient browsing, indexing and summarization of the long videos. Two sources of information for video segmentation are (i) the motion of the camera wearer, and (ii) the objects and activities recorded in the video. In this paper we address the motion cues for video segmentation. Motion based segmentation is especially difficult in egocentric videos when the camera is constantly moving due to natural head movement of the wearer. We propose a robust temporal segmentation of egocentric videos into a hierarchy of motion classes using a new Cumulative Displacement Curves. Unlike instantaneous motion vectors, segmentation using integrated motion vectors performs well even in dynamic and crowded scenes. No assumptions are made on the underlying scene structure and the method works in indoor as well as outdoor situations. We demonstrate the effectiveness of our approach using publicly available videos as well as choreographed videos. We also suggest an approach to detect the fixation of wearer's gaze in the walking portion of the egocentric videos.","PeriodicalId":319578,"journal":{"name":"2014 IEEE Conference on Computer Vision and Pattern Recognition","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128483469","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 171
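A simplified sketch of the cumulative-displacement idea: per-frame displacement vectors (e.g., the mean optical flow induced by the camera wearer) are integrated over time, and the smoothed slope of the cumulative curve separates stationary segments from moving ones. The window size, threshold, and toy data are arbitrary assumptions; the paper's full hierarchy of motion classes is not reproduced.

```python
# Segment a sequence of per-frame displacement vectors by the slope of the
# cumulative displacement curve: small slope -> static, large slope -> moving.
import numpy as np

def segment_by_cumulative_displacement(displacements, window=15, thresh=0.5):
    """displacements: (T, 2) array of per-frame motion vectors."""
    cum = np.cumsum(displacements, axis=0)             # cumulative displacement curve
    slope = np.zeros(len(cum))
    for t in range(len(cum)):                          # windowed slope, robust to jitter
        lo, hi = max(0, t - window), min(len(cum) - 1, t + window)
        slope[t] = np.linalg.norm(cum[hi] - cum[lo]) / max(hi - lo, 1)
    moving = slope > thresh
    # turn the boolean mask into (start, end, label) segments
    segments, start = [], 0
    for t in range(1, len(moving) + 1):
        if t == len(moving) or moving[t] != moving[start]:
            segments.append((start, t, "moving" if moving[start] else "static"))
            start = t
    return segments

# toy usage: 100 jittery static frames followed by 100 frames of walking forward
rng = np.random.default_rng(0)
disp = np.vstack([rng.normal(0.0, 0.2, (100, 2)),
                  rng.normal([1.0, 0.0], 0.2, (100, 2))])
print(segment_by_cumulative_displacement(disp))
```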
Sparse Dictionary Learning for Edit Propagation of High-Resolution Images
2014 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.365
Xiaowu Chen, Dongqing Zou, Jianwei Li, Xiaochun Cao, Qinping Zhao, Hao Zhang
{"title":"Sparse Dictionary Learning for Edit Propagation of High-Resolution Images","authors":"Xiaowu Chen, Dongqing Zou, Jianwei Li, Xiaochun Cao, Qinping Zhao, Hao Zhang","doi":"10.1109/CVPR.2014.365","DOIUrl":"https://doi.org/10.1109/CVPR.2014.365","url":null,"abstract":"We introduce a method of sparse dictionary learning for edit propagation of high-resolution images or video. Previous approaches for edit propagation typically employ a global optimization over the whole set of image pixels, incurring a prohibitively high memory and time consumption for high-resolution images. Rather than propagating an edit pixel by pixel, we follow the principle of sparse representation to obtain a compact set of representative samples (or features) and perform edit propagation on the samples instead. The sparse set of samples provides an intrinsic basis for an input image, and the coding coefficients capture the linear relationship between all pixels and the samples. The representative set of samples is then optimized by a novel scheme which maximizes the KL-divergence between each sample pair to remove redundant samples. We show several applications of sparsity-based edit propagation including video recoloring, theme editing, and seamless cloning, operating on both color and texture features. We demonstrate that with a sample-to-pixel ratio in the order of 0.01%, signifying a significant reduction on memory consumption, our method still maintains a high-degree of visual fidelity.","PeriodicalId":319578,"journal":{"name":"2014 IEEE Conference on Computer Vision and Pattern Recognition","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129655567","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 42
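The following sketch mirrors the sample-based propagation described above: pick a compact set of representative samples, code every pixel as a sparse linear combination of them, and push edits made on the samples back to all pixels through the same coefficients. KMeans centers stand in for the paper's KL-divergence-based sample selection, and the per-pixel features and edits are toy assumptions.

```python
# Sparse edit propagation sketch: pixels ~ sparse combinations of a small set of
# representative samples; edits defined on samples propagate via the same codes.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import SparseCoder

rng = np.random.default_rng(0)
n_pixels, n_samples = 10000, 50
features = rng.random((n_pixels, 5))                 # e.g. (r, g, b, x, y) per pixel

# 1. representative samples (stand-in for the optimized sample set)
samples = KMeans(n_clusters=n_samples, n_init=4,
                 random_state=0).fit(features).cluster_centers_
samples /= np.linalg.norm(samples, axis=1, keepdims=True)  # SparseCoder expects unit-norm atoms

# 2. sparse codes: each pixel as a sparse combination of samples
coder = SparseCoder(dictionary=samples, transform_algorithm="omp",
                    transform_n_nonzero_coefs=3)
codes = coder.transform(features)                    # (n_pixels, n_samples)

# 3. user edits specified only on the samples (here: a scalar recoloring weight)
sample_edits = rng.random(n_samples)
pixel_edits = codes @ sample_edits                   # propagate to every pixel

print(codes.shape, pixel_edits.shape)
```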
Learning to Detect Ground Control Points for Improving the Accuracy of Stereo Matching
2014 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.210
Aristotle Spyropoulos, N. Komodakis, Philippos Mordohai
{"title":"Learning to Detect Ground Control Points for Improving the Accuracy of Stereo Matching","authors":"Aristotle Spyropoulos, N. Komodakis, Philippos Mordohai","doi":"10.1109/CVPR.2014.210","DOIUrl":"https://doi.org/10.1109/CVPR.2014.210","url":null,"abstract":"While machine learning has been instrumental to the ongoing progress in most areas of computer vision, it has not been applied to the problem of stereo matching with similar frequency or success. We present a supervised learning approach for predicting the correctness of stereo matches based on a random forest and a set of features that capture various forms of information about each pixel. We show highly competitive results in predicting the correctness of matches and in confidence estimation, which allows us to rank pixels according to the reliability of their assigned disparities. Moreover, we show how these confidence values can be used to improve the accuracy of disparity maps by integrating them with an MRF-based stereo algorithm. This is an important distinction from current literature that has mainly focused on sparsification by removing potentially erroneous disparities to generate quasi-dense disparity maps.","PeriodicalId":319578,"journal":{"name":"2014 IEEE Conference on Computer Vision and Pattern Recognition","volume":"62 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127217148","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 108
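A minimal sketch of the confidence-prediction step: a random forest is trained on per-pixel features to predict whether an assigned disparity is correct, and its predicted probability ranks pixels by reliability. The two features and the synthetic labels below are illustrative stand-ins for the paper's feature set and ground truth.

```python
# Confidence prediction for stereo matches with a random forest; the most
# confident pixels can then serve as ground control points for an MRF stage.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 5000
# toy per-pixel features: best/second-best matching-cost ratio, left-right check error
cost_ratio = rng.random(n)
lr_error = rng.exponential(1.0, n)
X = np.column_stack([cost_ratio, lr_error])
# toy ground truth: matches tend to be correct when costs are distinctive and LR agrees
y = ((cost_ratio < 0.6) & (lr_error < 1.0)).astype(int)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
confidence = forest.predict_proba(X)[:, 1]           # P(match is correct)

# rank pixels by confidence; top-ranked pixels become candidate ground control points
ranking = np.argsort(-confidence)
print("10 most reliable pixels:", ranking[:10])
```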
Complex Non-rigid Motion 3D Reconstruction by Union of Subspaces
2014 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.200
Yingying Zhu, Dong Huang, F. D. L. Torre, S. Lucey
{"title":"Complex Non-rigid Motion 3D Reconstruction by Union of Subspaces","authors":"Yingying Zhu, Dong Huang, F. D. L. Torre, S. Lucey","doi":"10.1109/CVPR.2014.200","DOIUrl":"https://doi.org/10.1109/CVPR.2014.200","url":null,"abstract":"The task of estimating complex non-rigid 3D motion through a monocular camera is of increasing interest to the wider scientific community. Assuming one has the 2D point tracks of the non-rigid object in question, the vision community refers to this problem as Non-Rigid Structure from Motion (NRSfM). In this paper we make two contributions. First, we demonstrate empirically that the current state of the art approach to NRSfM (i.e. Dai et al. [5]) exhibits poor reconstruction performance on complex motion (i.e motions involving a sequence of primitive actions such as walk, sit and stand involving a human object). Second, we propose that this limitation can be circumvented by modeling complex motion as a union of subspaces. This does not naturally occur in Dai et al.'s approach which instead makes a less compact summation of subspaces assumption. Experiments on both synthetic and real videos illustrate the benefits of our approach for the complex nonrigid motion analysis.","PeriodicalId":319578,"journal":{"name":"2014 IEEE Conference on Computer Vision and Pattern Recognition","volume":"34 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128881961","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 133
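A toy numerical illustration of the union-of-subspaces assumption: shape vectors from two primitive actions lie near two different low-rank subspaces, so per-cluster low-rank models reconstruct them far better than a single global subspace of the same rank. This only demonstrates the modeling assumption; it is not the NRSfM reconstruction algorithm.

```python
# Compare one global low-rank subspace against a union of per-action subspaces
# on synthetic "shape vectors" drawn from two different rank-2 subspaces.
import numpy as np

def lowrank_error(X, rank):
    """Relative reconstruction error of the best rank-r approximation (via SVD)."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    Xr = U[:, :rank] * s[:rank] @ Vt[:rank]
    return float(np.linalg.norm(Xc - Xr) / np.linalg.norm(Xc))

rng = np.random.default_rng(0)
d, r = 40, 2                                     # shape-vector dim, subspace rank
basis_walk = rng.normal(size=(d, r))
basis_sit = rng.normal(size=(d, r))
walk = rng.normal(size=(100, r)) @ basis_walk.T  # frames from action "walk"
sit = rng.normal(size=(100, r)) @ basis_sit.T    # frames from action "sit"
frames = np.vstack([walk, sit]) + 0.01 * rng.normal(size=(200, d))

single = lowrank_error(frames, r)                # one global rank-2 subspace
union = 0.5 * (lowrank_error(frames[:100], r) + lowrank_error(frames[100:], r))
print(f"single-subspace error: {single:.3f}, union-of-subspaces error: {union:.3f}")
```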
Photometric Stereo Using Constrained Bivariate Regression for General Isotropic Surfaces
2014 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.280
Satoshi Ikehata, K. Aizawa
{"title":"Photometric Stereo Using Constrained Bivariate Regression for General Isotropic Surfaces","authors":"Satoshi Ikehata, K. Aizawa","doi":"10.1109/CVPR.2014.280","DOIUrl":"https://doi.org/10.1109/CVPR.2014.280","url":null,"abstract":"This paper presents a photometric stereo method that is purely pixelwise and handles general isotropic surfaces in a stable manner. Following the recently proposed sum-of-lobes representation of the isotropic reflectance function, we constructed a constrained bivariate regression problem where the regression function is approximated by smooth, bivariate Bernstein polynomials. The unknown normal vector was separated from the unknown reflectance function by considering the inverse representation of the image formation process, and then we could accurately compute the unknown surface normals by solving a simple and efficient quadratic programming problem. Extensive evaluations that showed the state-of-the-art performance using both synthetic and real-world images were performed.","PeriodicalId":319578,"journal":{"name":"2014 IEEE Conference on Computer Vision and Pattern Recognition","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128895117","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 98
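To show the regression machinery only, the sketch below fits a smooth bivariate Bernstein polynomial to samples of a reflectance-like function. The paper solves a constrained quadratic program with the surface normal as an unknown; here an ordinary least-squares fit on known (s, t) coordinates stands in for that step, and the polynomial degree and toy target are assumptions.

```python
# Fit a degree-n bivariate Bernstein polynomial to noisy samples of a smooth
# function on [0,1]^2 via ordinary least squares.
import numpy as np
from scipy.special import comb

def bernstein_basis(s, t, n=4):
    """Design matrix of the bivariate Bernstein basis B_{i,n}(s) * B_{j,n}(t)."""
    Bs = np.stack([comb(n, i) * s**i * (1 - s)**(n - i) for i in range(n + 1)], axis=1)
    Bt = np.stack([comb(n, j) * t**j * (1 - t)**(n - j) for j in range(n + 1)], axis=1)
    return np.einsum("ki,kj->kij", Bs, Bt).reshape(len(s), -1)

rng = np.random.default_rng(0)
s, t = rng.random(500), rng.random(500)
values = np.cos(np.pi * s) * t**2 + 0.01 * rng.normal(size=500)   # toy smooth target

A = bernstein_basis(s, t)                      # (500, (n+1)^2) design matrix
coeffs, *_ = np.linalg.lstsq(A, values, rcond=None)
fit = A @ coeffs
print("RMS fitting error:", float(np.sqrt(np.mean((fit - values) ** 2))))
```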
Geometric Generative Gaze Estimation (G3E) for Remote RGB-D Cameras
2014 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.229
Kenneth Alberto Funes Mora, J. Odobez
{"title":"Geometric Generative Gaze Estimation (G3E) for Remote RGB-D Cameras","authors":"Kenneth Alberto Funes Mora, J. Odobez","doi":"10.1109/CVPR.2014.229","DOIUrl":"https://doi.org/10.1109/CVPR.2014.229","url":null,"abstract":"We propose a head pose invariant gaze estimation model for distant RGB-D cameras. It relies on a geometric understanding of the 3D gaze action and generation of eye images. By introducing a semantic segmentation of the eye region within a generative process, the model (i) avoids the critical feature tracking of geometrical approaches requiring high resolution images, (ii) decouples the person dependent geometry from the ambient conditions, allowing adaptation to different conditions without retraining. Priors in the generative framework are adequate for training from few samples. In addition, the model is capable of gaze extrapolation allowing for less restrictive training schemes. Comparisons with state of the art methods validate these properties which make our method highly valuable for addressing many diverse tasks in sociology, HRI and HCI.","PeriodicalId":319578,"journal":{"name":"2014 IEEE Conference on Computer Vision and Pattern Recognition","volume":"88 6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132511883","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 79