Latest Publications: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Similarity Learning with Spatial Constraints for Person Re-identification
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2016-06-27 DOI: 10.1109/CVPR.2016.142 Pages: 1268-1277
Dapeng Chen, Zejian Yuan, Badong Chen, Nanning Zheng
Pose variation remains one of the major factors that adversely affect the accuracy of person re-identification. Such variation is not arbitrary, as body parts (e.g. head, torso, legs) have relatively stable spatial distributions. Decomposing the variability of global appearance according to this spatial distribution can benefit person matching. We therefore learn a novel similarity function consisting of multiple sub-similarity measurements, each responsible for one subregion. In particular, we take advantage of the recently proposed polynomial feature map to describe the matching within each subregion, and inject all the feature maps into a unified framework. The framework not only outputs similarity measurements for different regions, but also enforces consistency among them. It combines local similarities with a global similarity to exploit their complementary strengths, and is flexible enough to incorporate multiple visual cues to further improve performance. In experiments, we analyze the effectiveness of the major components. Results on four datasets show significant and consistent improvements over state-of-the-art methods.
Citations: 340
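The region-wise idea lends itself to a compact illustration. The sketch below is not the authors' learned polynomial feature map: it simply splits two descriptor maps into horizontal stripes, scores each stripe, and blends the stripe average with a global score, with the fusion weight `w_local` fixed by hand rather than learned.

```python
import numpy as np

def stripe_similarity(feat_a, feat_b, n_stripes=4, w_local=0.5):
    """Toy region-constrained similarity: split each feature map into
    horizontal stripes, score each stripe, then blend with a global
    score. Illustrative only; the paper learns polynomial feature maps
    and the fusion, whereas both are hand-set here."""
    def cos(u, v):
        return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v) + 1e-8))
    stripes_a = np.array_split(feat_a, n_stripes, axis=0)
    stripes_b = np.array_split(feat_b, n_stripes, axis=0)
    local = np.mean([cos(a.ravel(), b.ravel())
                     for a, b in zip(stripes_a, stripes_b)])
    global_ = cos(feat_a.ravel(), feat_b.ravel())
    return w_local * local + (1 - w_local) * global_

# Example: two random H x D descriptor maps standing in for two images.
rng = np.random.default_rng(0)
print(stripe_similarity(rng.normal(size=(16, 32)), rng.normal(size=(16, 32))))
```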
Constrained Joint Cascade Regression Framework for Simultaneous Facial Action Unit Recognition and Facial Landmark Detection
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2016-06-27 DOI: 10.1109/CVPR.2016.370 Pages: 3400-3408
Yue Wu, Q. Ji
The cascade regression framework has been shown to be effective for facial landmark detection. It starts from an initial face shape and gradually predicts shape updates from local appearance features, generating the landmark locations for the next iteration until convergence. In this paper, we improve upon this framework and propose the Constrained Joint Cascade Regression Framework (CJCRF) for simultaneous facial action unit recognition and facial landmark detection, two related face analysis tasks that are seldom exploited together. In particular, we first learn the relationships among facial action units and face shapes as a constraint. Then, in the proposed framework, with the help of this constraint, we iteratively update the facial landmark locations and the action unit activation probabilities until convergence. Experimental results demonstrate that the intertwined relationships of facial action units and face shapes boost the performance of both tasks, and show the effectiveness of the proposed method compared to state-of-the-art works.
Citations: 73
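As a rough picture of the cascade update loop described above, the following sketch runs a few linear regression stages that additively refine a shape vector. The regressors are assumed pretrained, the "features" are a stand-in for local appearance descriptors, and the paper's AU-shape constraint is deliberately omitted.

```python
import numpy as np

def cascade_update(shape, local_feats, stages, au_probs=None):
    """Minimal cascade-regression loop: each stage maps appearance
    features (and, in the full CJCRF, AU activation probabilities) to an
    additive shape update. `stages` is a list of (W, b) linear regressors
    assumed trained offline; the AU-shape constraint is not implemented."""
    extra = au_probs if au_probs is not None else np.empty(0)
    for W, b in stages:
        x = np.concatenate([local_feats(shape), extra])
        shape = shape + W @ x + b  # additive shape refinement per stage
    return shape

# Toy usage: 5 landmarks flattened to a 10-vector; "features" are just
# the current shape itself, standing in for local appearance descriptors.
rng = np.random.default_rng(1)
stages = [(rng.normal(scale=0.01, size=(10, 10)), np.zeros(10))
          for _ in range(3)]
print(cascade_update(rng.normal(size=10), lambda s: s, stages))
```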
Theory and Practice of Structure-From-Motion Using Affine Correspondences
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2016-06-27 DOI: 10.1109/CVPR.2016.590 Pages: 5470-5478
Carolina Raposo, J. Barreto
Affine Correspondences (ACs) are more informative than the Point Correspondences (PCs) used as input in mainstream Structure-from-Motion (SfM) algorithms. Since ACs enable models to be estimated from fewer correspondences, their use can dramatically reduce the number of combinations during the iterative sample-and-test step that exists in most SfM pipelines. However, using ACs instead of PCs as input for SfM requires fully understanding the relations between ACs and multi-view geometry, as well as establishing practical, effective AC-based algorithms. This article is a step in this direction: it provides a clear account of how ACs constrain two-view geometry, and proposes new algorithms for plane segmentation and visual odometry that compare favourably with methods relying on PCs.
Citations: 52
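The claimed reduction in sample-and-test combinations follows directly from the standard RANSAC trial-count formula: fewer correspondences per minimal sample shrinks the exponent. The sample sizes below (5 PCs versus 2 ACs for relative pose) are illustrative assumptions consistent with each AC carrying more constraints than a PC.

```python
import math

def ransac_trials(inlier_ratio, sample_size, confidence=0.99):
    """Standard RANSAC trial count: smallest N such that
    1 - (1 - w**s)**N >= confidence, for inlier ratio w and sample size s."""
    return math.ceil(math.log(1 - confidence) /
                     math.log(1 - inlier_ratio ** sample_size))

# Fewer correspondences per minimal sample => far fewer trials.
# Sample sizes are assumptions: classical 5-point PC solver vs. a 2-AC solver.
for w in (0.5, 0.7):
    print(f"w={w}: PCs (s=5) -> {ransac_trials(w, 5)} trials, "
          f"ACs (s=2) -> {ransac_trials(w, 2)} trials")
```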
Joint Learning of Single-Image and Cross-Image Representations for Person Re-identification
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2016-06-27 DOI: 10.1109/CVPR.2016.144 Pages: 1288-1296
Faqiang Wang, W. Zuo, Liang Lin, D. Zhang, Lei Zhang
Person re-identification has usually been solved as either the matching of single-image representations (SIR) or the classification of cross-image representations (CIR). In this work, we exploit the connection between these two categories of methods and propose a joint learning framework that unifies SIR and CIR using a convolutional neural network (CNN). Specifically, our deep architecture contains one shared sub-network together with two sub-networks that extract the SIRs of given images and the CIRs of given image pairs, respectively. The SIR sub-network needs to be computed only once per image (in both the probe and gallery sets), and the depth of the CIR sub-network is kept minimal to reduce the computational burden. The two types of representation can therefore be jointly optimized, pursuing better matching accuracy at moderate computational cost. Furthermore, the representations learned with pairwise-comparison and triplet-comparison objectives can be combined to improve matching performance. Experiments on the CUHK03, CUHK01 and VIPeR datasets show that the proposed method achieves favorable accuracy compared with the state of the art.
Citations: 366
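A minimal schematic of the described split, with invented layer sizes: a shared trunk feeds a per-image SIR head (computable once per gallery image) and a deliberately shallow CIR head on concatenated pair features. The joint loss combining the two outputs is left out.

```python
import torch
import torch.nn as nn

class JointSIRCIR(nn.Module):
    """Schematic of the joint architecture (all layer sizes invented):
    a shared trunk, a per-image SIR embedding head, and a shallow CIR
    head that scores image pairs."""
    def __init__(self):
        super().__init__()
        self.shared = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1),
                                    nn.ReLU(), nn.AdaptiveAvgPool2d(4))
        self.sir_head = nn.Linear(16 * 16, 64)     # single-image embedding
        self.cir_head = nn.Linear(2 * 16 * 16, 1)  # shallow pair scorer

    def forward(self, a, b):
        fa = self.shared(a).flatten(1)
        fb = self.shared(b).flatten(1)
        sir_dist = (self.sir_head(fa) - self.sir_head(fb)).pow(2).sum(1)
        cir_score = self.cir_head(torch.cat([fa, fb], dim=1)).squeeze(1)
        return sir_dist, cir_score  # combined in a joint loss in the paper

m = JointSIRCIR()
a, b = torch.randn(2, 3, 32, 32), torch.randn(2, 3, 32, 32)
print([t.shape for t in m(a, b)])
```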
Predicting the Where and What of Actors and Actions through Online Action Localization
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2016-06-27 DOI: 10.1109/CVPR.2016.290 Pages: 2648-2657
K. Soomro, Haroon Idrees, M. Shah
This paper proposes a novel approach to the challenging problem of "online action localization", which entails predicting actions and their locations as they happen in a video. Typically, action localization or recognition is performed offline, where all frames in the video are processed together and action labels are not predicted for the future. This precludes timely localization of actions, an important consideration for surveillance tasks. In our approach, given a batch of frames from the immediate past in a video, we estimate pose and oversegment the current frame into superpixels. Next, we discriminatively train an actor foreground model on the superpixels using the pose bounding boxes. A Conditional Random Field with superpixels as nodes, and edges connecting spatio-temporal neighbors, is used to obtain action segments. The action confidence is predicted using dynamic programming on SVM scores obtained on short segments of the video, thereby capturing sequential information about the actions. The issue of visual drift is handled by updating the appearance model and refining the pose in an online manner. Lastly, we introduce a new measure to quantify the performance of action prediction (i.e. online action localization), which analyzes how prediction accuracy varies as a function of the observed portion of the video. Our experiments suggest that, despite using only a few frames to localize actions at each time instant, we are able to predict the action and obtain results competitive with state-of-the-art offline methods.
Citations: 87
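The proposed evaluation measure, accuracy as a function of the observed portion of the video, is easy to mock up. In this sketch, `preds_per_video[v][t]` stands for the label an online system has predicted after seeing frame t; the names and toy data are invented.

```python
import numpy as np

def prediction_accuracy_curve(preds_per_video, labels, fractions):
    """Sketch of the paper's online evaluation idea: how often the
    predicted action label is correct when only a prefix of each video
    has been observed."""
    curve = []
    for f in fractions:
        hits = [preds[min(int(f * len(preds)), len(preds) - 1)] == lab
                for preds, lab in zip(preds_per_video, labels)]
        curve.append(float(np.mean(hits)))
    return curve

# Toy usage: two videos whose online predictions stabilize over time.
videos = [[0, 0, 2, 2, 2], [1, 1, 1, 1, 1]]
print(prediction_accuracy_curve(videos, [2, 1], [0.2, 0.6, 1.0]))
```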
Self-Adaptive Matrix Completion for Heart Rate Estimation from Face Videos under Realistic Conditions
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2016-06-27 DOI: 10.1109/CVPR.2016.263 Pages: 2396-2404
S. Tulyakov, Xavier Alameda-Pineda, E. Ricci, L. Yin, J. Cohn, N. Sebe
Recent studies in computer vision have shown that, while practically invisible to a human observer, skin color changes due to blood flow can be captured in face videos and, surprisingly, used to estimate the heart rate (HR). While considerable progress has been made in the last few years, many issues remain open. In particular, state-of-the-art approaches are not robust enough to operate in natural conditions (e.g. in the presence of spontaneous movements, facial expressions, or illumination changes). In contrast to previous approaches that estimate the HR by processing all the skin pixels inside a fixed region of interest, we introduce a strategy to dynamically select face regions useful for robust HR estimation. Our approach, inspired by recent advances in matrix completion theory, predicts the HR while simultaneously discovering the best regions of the face to use for estimation. A thorough experimental evaluation conducted on public benchmarks suggests that the proposed approach significantly outperforms state-of-the-art HR estimation methods in naturalistic conditions.
Citations: 240
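Once good face regions have been selected, HR estimation reduces to finding the dominant frequency of the resulting color trace. The sketch below shows only that generic back-end step, assuming a pre-cleaned 1D signal; the paper's self-adaptive matrix completion stage, its actual contribution, is omitted.

```python
import numpy as np

def estimate_hr(signal, fps, lo=0.7, hi=4.0):
    """Generic rPPG back-end: pick the dominant frequency of a skin-color
    trace within a plausible heart-rate band (0.7-4 Hz, i.e. 42-240 bpm).
    The paper's region-selection step happens before this and is omitted."""
    sig = signal - signal.mean()
    freqs = np.fft.rfftfreq(len(sig), d=1.0 / fps)
    power = np.abs(np.fft.rfft(sig)) ** 2
    band = (freqs >= lo) & (freqs <= hi)
    return 60.0 * freqs[band][np.argmax(power[band])]  # beats per minute

# Toy usage: a noisy 72-bpm (1.2 Hz) sinusoid sampled at 30 fps for 10 s.
t = np.arange(0, 10, 1 / 30)
print(estimate_hr(np.sin(2 * np.pi * 1.2 * t) + 0.1 * np.random.randn(t.size), 30))
```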
A Probabilistic Framework for Color-Based Point Set Registration
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2016-06-27 DOI: 10.1109/CVPR.2016.201 Pages: 1818-1826
Martin Danelljan, G. Meneghetti, F. Khan, M. Felsberg
In recent years, sensors capable of measuring both color and depth information have become increasingly popular. Despite the abundance of colored point set data, state-of-the-art probabilistic registration techniques ignore the available color information. In this paper, we propose a probabilistic point set registration framework that exploits the color information associated with the points. Our method is based on a model of the joint distribution of 3D point observations and their color information. The proposed model captures discriminative color information while remaining computationally efficient. We derive an EM algorithm for jointly estimating the model parameters and the relative transformations. Comprehensive experiments are performed on the Stanford Lounge dataset, captured by an RGB-D camera, and on two point sets captured by a Lidar sensor. Our results demonstrate a significant gain in robustness and accuracy when incorporating color information: on the Stanford Lounge dataset, our approach achieves a relative reduction of the failure rate by 78% compared to the baseline. Furthermore, the proposed model outperforms standard strategies for combining color and 3D point information, leading to state-of-the-art results.
Citations: 44
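To make the role of color concrete, here is a heavily simplified EM-style step under assumptions of our own: soft correspondences are weighted by both spatial and color proximity with hand-set bandwidths, followed by a closed-form rigid update (Kabsch). The paper's full mixture model and joint parameter estimation are richer than this.

```python
import numpy as np

def color_registration_step(src_xyz, src_rgb, tgt_xyz, tgt_rgb,
                            sigma_xyz=0.05, sigma_rgb=0.1):
    """One simplified EM-style step of colored registration: compute soft
    correspondences from joint spatial/color proximity, then fit the rigid
    transform (R, t) aligning the source to its soft targets via Kabsch."""
    d2_xyz = ((src_xyz[:, None, :] - tgt_xyz[None, :, :]) ** 2).sum(-1)
    d2_rgb = ((src_rgb[:, None, :] - tgt_rgb[None, :, :]) ** 2).sum(-1)
    resp = np.exp(-d2_xyz / (2 * sigma_xyz**2) - d2_rgb / (2 * sigma_rgb**2))
    resp /= resp.sum(axis=1, keepdims=True) + 1e-12
    virt = resp @ tgt_xyz                      # soft target per source point
    mu_s, mu_v = src_xyz.mean(0), virt.mean(0)
    U, _, Vt = np.linalg.svd((src_xyz - mu_s).T @ (virt - mu_v))
    if np.linalg.det((U @ Vt).T) < 0:          # keep a proper rotation
        Vt[-1] *= -1
    R = (U @ Vt).T
    return R, mu_v - R @ mu_s

# Toy usage: registering a colored cloud to itself recovers ~identity.
rng = np.random.default_rng(2)
P, C = rng.normal(size=(50, 3)), rng.uniform(size=(50, 3))
R, t = color_registration_step(P, C, P, C)
print(np.round(R, 2), np.round(t, 2))
```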
Laplacian Patch-Based Image Synthesis
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2016-06-27 DOI: 10.1109/CVPR.2016.298 Pages: 2727-2735
J. Lee, Inchang Choi, Min H. Kim
Patch-based image synthesis has been enriched by global optimization on the image pyramid, and gradient-based synthesis has subsequently improved structural coherence and detail. However, the gradient operator is directional and inconsistent, requires computing multiple operators, and introduces a significant computational burden through the Poisson equation, whose solution often exhibits artifacts for non-integrable gradient fields. In this paper, we propose patch-based synthesis using a Laplacian pyramid to improve correspondence search with enhanced awareness of edge structures. In contrast to gradient operators, the Laplacian pyramid has the advantage of being isotropic in detecting changes, providing more consistent performance in decomposing the base structure and localizing detail. Furthermore, it does not require heavy computation, as it employs an approximation by differences of Gaussians. We examine the potential of the Laplacian pyramid for enhanced edge-aware correspondence search, and demonstrate the effectiveness of the Laplacian-based approach over state-of-the-art patch-based image synthesis methods.
Citations: 38
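The difference-of-Gaussians approximation mentioned in the abstract is cheap to reproduce. The sketch below builds such a pyramid with a fixed, assumed sigma: each level stores the detail removed by one round of isotropic blurring, and the last level stores the remaining base.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def laplacian_pyramid(img, levels=4, sigma=1.0):
    """Laplacian pyramid approximated by differences of Gaussians: each
    level holds the band-pass detail removed by one isotropic blur; the
    final level holds the coarse base. Sigma is a hand-picked assumption."""
    pyr, cur = [], img.astype(float)
    for _ in range(levels - 1):
        low = gaussian_filter(cur, sigma)
        pyr.append(cur - low)          # band-pass detail (DoG)
        cur = low[::2, ::2]            # downsample the low-pass residue
    pyr.append(cur)                    # coarsest base level
    return pyr

# Toy usage; reconstruction would upsample and re-add the bands.
bands = laplacian_pyramid(np.random.rand(64, 64))
print([b.shape for b in bands])
```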
Multi-view People Tracking via Hierarchical Trajectory Composition
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2016-06-27 DOI: 10.1109/CVPR.2016.461 Pages: 4256-4265
Yuanlu Xu, Xiaobai Liu, Yang Liu, Song-Chun Zhu
This paper presents a hierarchical composition approach for multi-view object tracking. The key idea is to adaptively exploit multiple cues in both 2D and 3D, e.g. ground occupancy consistency, appearance similarity, and motion coherence, which are mutually complementary when tracking the humans of interest over time. While online feature selection has been extensively studied in the literature, it remains unclear how to effectively schedule these cues for tracking, especially when encountering challenges such as occlusions, conjunctions, and appearance variations. To this end, we propose a hierarchical composition model and reformulate multi-view multi-object tracking as a problem of compositional structure optimization. We set up a set of composition criteria, each corresponding to one particular cue; the hierarchical composition process is pursued by exploiting different criteria, which impose constraints between a graph node and its offspring in the hierarchy. We learn the composition criteria by maximum likelihood estimation on annotated data and efficiently construct the hierarchical graph with an iterative greedy pursuit algorithm. In experiments, we demonstrate the superior performance of our approach on three public datasets, one of which we newly created to test various challenges in multi-view multi-object tracking.
Citations: 121
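A skeleton of the iterative greedy pursuit, under simplifications of our own: trajectory fragments are repeatedly merged by whichever hand-written composition criterion scores highest, until no pair clears a threshold. The paper instead learns its criteria by MLE over multiple 2D/3D cues.

```python
def greedy_composition(tracklets, criteria, score_threshold=0.0):
    """Greedy hierarchical composition: repeatedly merge the pair of
    trajectory nodes whose best criterion score is highest. Criteria are
    callables scoring a candidate merge; the paper's MLE-learned criteria
    (occupancy, appearance, motion, ...) are replaced by stand-ins."""
    nodes = list(tracklets)
    while len(nodes) > 1:
        i, j, s = max(((i, j, max(c(nodes[i], nodes[j]) for c in criteria))
                       for i in range(len(nodes))
                       for j in range(i + 1, len(nodes))),
                      key=lambda x: x[2])
        if s <= score_threshold:
            break
        merged = nodes[i] + nodes[j]   # compose two trajectory fragments
        nodes = [n for k, n in enumerate(nodes) if k not in (i, j)] + [merged]
    return nodes

# Toy usage: fragments as lists of (t, x) points; the single criterion
# favours temporal adjacency of fragment endpoints.
frags = [[(0, 0.0), (1, 0.1)], [(2, 0.2), (3, 0.3)], [(9, 5.0)]]
adjacency = lambda a, b: 1.0 - abs(a[-1][0] - b[0][0]) / 10.0
print(greedy_composition(frags, [adjacency], 0.5))
```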
Learning with Side Information through Modality Hallucination
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2016-06-27 DOI: 10.1109/CVPR.2016.96 Pages: 826-834
Judy Hoffman, Saurabh Gupta, Trevor Darrell
We present a modality hallucination architecture for training an RGB object detection model that incorporates depth side information at training time. Our convolutional hallucination network learns a new and complementary RGB image representation that is taught to mimic convolutional mid-level features from a depth network. At test time, images are processed jointly through the RGB and hallucination networks to produce improved detection performance. Our method thus transfers information commonly extracted from depth training data to a network that can extract that information from the RGB counterpart. We present results on the standard NYUDv2 dataset and report improvement on the RGB detection task.
Citations: 199
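The core training signal is a feature-mimicking loss between the hallucination branch (RGB input) and the depth branch's mid-level activations. The toy below uses stand-in two-layer branches and a three-channel depth encoding as assumptions; in the paper this loss is added to the usual detection objectives.

```python
import torch
import torch.nn as nn

# Schematic of the hallucination objective: an RGB-input branch is
# trained so its activations match those of a depth network. Both
# branches here are invented stand-ins, not the paper's architecture.
def make_branch():
    return nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(),
                         nn.Conv2d(8, 8, 3, padding=1))

depth_net, halluc_net = make_branch(), make_branch()
rgb = torch.randn(1, 3, 32, 32)
depth3 = torch.randn(1, 3, 32, 32)    # assumed 3-channel depth encoding

with torch.no_grad():                 # depth net acts as a frozen teacher
    target = depth_net(depth3)
halluc_loss = nn.functional.mse_loss(halluc_net(rgb), target)
halluc_loss.backward()                # added to the detection losses in training
print(float(halluc_loss))
```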