Latest publications from the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Coordinating Multiple Disparity Proposals for Stereo Computation
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2016-06-27 DOI: 10.1109/CVPR.2016.436
Ang Li, Dapeng Chen, Yuanliu Liu, Zejian Yuan
Abstract: While great progress has been made in stereo computation over the last decades, large textureless regions remain challenging. Segment-based methods can tackle this problem properly, but their performance is sensitive to the segmentation results. In this paper, we alleviate the sensitivity by generating multiple proposals on absolute and relative disparities from multiple segmentations. These proposals supply rich descriptions of surface structures. In particular, the relative disparity between distant pixels can encode large structures, which is critical for handling large textureless regions. The proposals are coordinated by point-wise competition and pairwise collaboration within an MRF model. During inference, dynamic programming is performed in different directions with various step sizes, so that long-range connections are better preserved. In the experiments, we carefully analyze the effectiveness of the major components. Results on the 2014 Middlebury and KITTI 2015 stereo benchmarks show that our method is comparable to the state of the art.
Pages: 4022-4030
Citations: 29
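The inference step described above combines a per-pixel data term with pairwise interactions inside an MRF and solves it with dynamic programming along different directions. As a rough illustration only, the Python sketch below selects one disparity proposal per pixel along a single scanline using a generic unary cost plus a disparity-jump penalty; the paper's relative-disparity proposals, variable step sizes, and competition/collaboration terms are not reproduced, and all names and weights here are placeholders.

```python
import numpy as np

def coordinate_proposals_1d(unary, disp, smooth_weight=1.0):
    """Select one disparity proposal per pixel along a scanline by DP.

    unary : (W, K) data cost of proposal k at pixel x.
    disp  : (W, K) disparity value carried by proposal k at pixel x.
    The pairwise term penalizes the disparity jump between the proposals
    chosen at neighboring pixels (a toy stand-in for the paper's
    point-wise competition / pairwise collaboration terms).
    """
    W, K = unary.shape
    cost = unary[0].copy()
    back = np.zeros((W, K), dtype=int)
    for x in range(1, W):
        jump = np.abs(disp[x - 1][:, None] - disp[x][None, :])   # (K_prev, K_cur)
        total = cost[:, None] + smooth_weight * jump
        back[x] = np.argmin(total, axis=0)
        cost = unary[x] + total[back[x], np.arange(K)]
    # backtrack the best labeling from the last pixel
    labels = np.empty(W, dtype=int)
    labels[-1] = int(np.argmin(cost))
    for x in range(W - 2, -1, -1):
        labels[x] = back[x + 1, labels[x + 1]]
    return labels

# toy example: 8 pixels, 3 proposals per pixel
rng = np.random.default_rng(0)
print(coordinate_proposals_1d(rng.random((8, 3)), rng.integers(0, 16, (8, 3))))
```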
MDL-CW: A Multimodal Deep Learning Framework with CrossWeights
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2016-06-27 DOI: 10.1109/CVPR.2016.285
Sarah Rastegar, M. Baghshah, H. Rabiee, Seyed Mohsen Shojaee
Abstract: Deep learning has received much attention as one of the most powerful approaches for multimodal representation learning in recent years. An ideal model for multimodal data can reason about missing modalities using the available ones, and usually provides more information when multiple modalities are considered. Previous deep models contain separate modality-specific networks and find a shared representation on top of those networks; therefore, they only consider high-level interactions between modalities when finding a joint representation. In this paper, we propose a multimodal deep learning framework (MDL-CW) that exploits cross weights between the representations of modalities and gradually learns the interactions of the modalities in a deep network, from low-level to high-level interactions. Moreover, we theoretically show that considering these interactions provides more intra-modality information, and we introduce a multi-stage pre-training method based on the properties of multimodal data. In the proposed framework, as opposed to existing deep methods for multimodal data, we reconstruct the representation of each modality at a given level using the representations of the other modalities in the previous layer. Extensive experimental results show that the proposed model outperforms state-of-the-art information retrieval methods for both image and text queries on the PASCAL-sentence and SUN-Attribute databases.
Pages: 2601-2609
Citations: 43
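The key idea above is that each modality's representation at one level is built not only from its own previous layer but also from the other modalities' previous-layer representations through cross weights. The Python sketch below shows a single toy cross-weighted layer under that reading of the abstract; the actual MDL-CW architecture, layer sizes, pre-training stages, and losses are assumptions, not taken from the paper.

```python
import numpy as np

def cross_weighted_layer(h_img, h_txt, W_ii, W_tt, W_ti, W_it):
    """One toy cross-weighted layer: each modality's next representation
    mixes its own previous-layer features with the other modality's,
    so cross-modal interactions are modeled already at this level.
    The real MDL-CW training procedure is not reproduced here."""
    relu = lambda z: np.maximum(z, 0.0)
    h_img_next = relu(h_img @ W_ii + h_txt @ W_ti)  # image branch also sees text
    h_txt_next = relu(h_txt @ W_tt + h_img @ W_it)  # text branch also sees image
    return h_img_next, h_txt_next

# toy shapes: 64-d image features, 32-d text features, 48-d next layer
rng = np.random.default_rng(1)
h_i, h_t = rng.random((1, 64)), rng.random((1, 32))
out_i, out_t = cross_weighted_layer(
    h_i, h_t,
    W_ii=rng.normal(size=(64, 48)), W_tt=rng.normal(size=(32, 48)),
    W_ti=rng.normal(size=(32, 48)), W_it=rng.normal(size=(64, 48)),
)
print(out_i.shape, out_t.shape)
```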
A Task-Oriented Approach for Cost-Sensitive Recognition
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2016-06-27 DOI: 10.1109/CVPR.2016.242
Roozbeh Mottaghi, Hannaneh Hajishirzi, Ali Farhadi
Abstract: With the recent progress in visual recognition, we have already started to see a surge of vision-related real-world applications. These applications, unlike general scene understanding, are task oriented and require specific information from visual data. Considering the current growth in new sensory devices, feature designs, feature learning methods, and algorithms, the search in the space of features and models becomes combinatorial. In this paper, we propose a novel cost-sensitive, task-oriented recognition method based on a combination of linguistic semantics and visual cues. Our task-oriented framework is able to generalize to unseen tasks for which there is no training data, and it outperforms state-of-the-art cost-based recognition baselines on our new task-based dataset.
Pages: 2203-2211
Citations: 4
A Weighted Variational Model for Simultaneous Reflectance and Illumination Estimation
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2016-06-27 DOI: 10.1109/CVPR.2016.304
Xueyang Fu, Delu Zeng, Yue Huang, Xiao-Ping Zhang, Xinghao Ding
Abstract: We propose a weighted variational model to estimate both the reflectance and the illumination from an observed image. We show that, although it is widely adopted for ease of modeling, the log-transformed image is not ideal for this task. Based on this investigation of the logarithmic transformation, a new weighted variational model is proposed for better prior representation, imposed through the regularization terms. Unlike conventional variational models, the proposed model preserves more detail in the estimated reflectance and can also suppress noise to some extent. An alternating minimization scheme is adopted to solve the proposed model. Experimental results demonstrate the effectiveness of the proposed model and its algorithm. Compared with other variational methods, the proposed method yields comparable or better results on both subjective and objective assessments.
Pages: 2782-2790
Citations: 642
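The abstract above describes estimating reflectance and illumination by alternating minimization of a variational energy. The sketch below is only a generic alternating gradient scheme for the decomposition S ≈ R·L, with a quadratic smoothness prior that keeps the illumination smooth while the reflectance retains detail; the paper's specific weighting scheme, priors, and solver are not reproduced, and all parameter values are placeholders.

```python
import numpy as np

def laplacian(x):
    """Discrete Laplacian with wrap-around borders (toy smoothness gradient)."""
    return (np.roll(x, 1, 0) + np.roll(x, -1, 0) +
            np.roll(x, 1, 1) + np.roll(x, -1, 1) - 4.0 * x)

def decompose(S, alpha=0.5, beta=0.01, lr=0.2, iters=300):
    """Toy alternating gradient descent for S ~= R * L.

    Energy: ||R*L - S||^2 + alpha*||grad L||^2 + beta*||grad R||^2,
    so the illumination L is pushed to be smooth while the reflectance R
    keeps detail.  A generic illustration only; the paper's weighted
    variational model and its solver differ.
    """
    L = np.clip(S, 1e-3, None)
    R = np.ones_like(S)
    for _ in range(iters):
        resid = R * L - S
        R = R - lr * (resid * L - beta * laplacian(R))   # descent step in R
        L = L - lr * (resid * R - alpha * laplacian(L))  # descent step in L
        R, L = np.clip(R, 0.0, 1.0), np.clip(L, 1e-3, None)
    return R, L

# toy usage: 32x32 image with a smooth illumination ramp times a binary reflectance
y = np.linspace(0.2, 1.0, 32)
S = np.outer(y, np.ones(32)) * (0.3 + 0.7 * (np.random.default_rng(2).random((32, 32)) > 0.5))
R, L = decompose(S)
print(R.mean(), L.mean())
```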
Face2Face: Real-Time Face Capture and Reenactment of RGB Videos
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2016-06-27 DOI: 10.1145/3292039
Justus Thies, M. Zollhöfer, M. Stamminger, C. Theobalt, M. Nießner
Abstract: We present a novel approach for real-time facial reenactment of a monocular target video sequence (e.g., a YouTube video). The source sequence is also a monocular video stream, captured live with a commodity webcam. Our goal is to animate the facial expressions of the target video by a source actor and re-render the manipulated output video in a photo-realistic fashion. To this end, we first address the under-constrained problem of facial identity recovery from monocular video by non-rigid model-based bundling. At run time, we track the facial expressions of both the source and target video using a dense photometric consistency measure. Reenactment is then achieved by fast and efficient deformation transfer between source and target. The mouth interior that best matches the re-targeted expression is retrieved from the target sequence and warped to produce an accurate fit. Finally, we convincingly re-render the synthesized target face on top of the corresponding video stream so that it seamlessly blends with the real-world illumination. We demonstrate our method in a live setup, where YouTube videos are reenacted in real time.
Pages: 2387-2395
Citations: 1552
Learning Aligned Cross-Modal Representations from Weakly Aligned Data
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2016-06-27 DOI: 10.1109/CVPR.2016.321
Lluís Castrejón, Y. Aytar, Carl Vondrick, H. Pirsiavash, A. Torralba
Abstract: People can recognize scenes across many different modalities beyond natural images. In this paper, we investigate how to learn cross-modal scene representations that transfer across modalities. To study this problem, we introduce a new cross-modal scene dataset. While convolutional neural networks can categorize cross-modal scenes well, they also learn an intermediate representation that is not aligned across modalities, which is undesirable for cross-modal transfer applications. We present methods to regularize cross-modal convolutional neural networks so that they have a shared representation that is agnostic of the modality. Our experiments suggest that our scene representation can help transfer representations across modalities for retrieval. Moreover, our visualizations suggest that units emerge in the shared representation that tend to activate on consistent concepts independently of the modality.
Pages: 2940-2949
Citations: 158
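The abstract above says the networks are regularized so that the shared representation is agnostic of the modality, but it does not spell out the regularizer. As a purely generic stand-in, and not the paper's method, the sketch below penalizes differences in batch feature statistics between two modalities, which is one common way to encourage modality-invariant features.

```python
import numpy as np

def modality_alignment_penalty(feat_a, feat_b):
    """Generic modality-agnostic regularizer: penalize the gap between the
    batch feature statistics of two modalities so the shared layer cannot
    easily encode which modality a sample came from.  The paper's own
    regularization terms are not specified in the abstract and may differ.
    """
    mean_gap = np.sum((feat_a.mean(axis=0) - feat_b.mean(axis=0)) ** 2)
    var_gap = np.sum((feat_a.var(axis=0) - feat_b.var(axis=0)) ** 2)
    return mean_gap + var_gap

# toy usage: batches of 16 samples with 128-d shared features per modality
rng = np.random.default_rng(3)
print(modality_alignment_penalty(rng.random((16, 128)), rng.random((16, 128))))
```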
Monocular Depth Estimation Using Neural Regression Forest
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2016-06-27 DOI: 10.1109/CVPR.2016.594
Anirban Roy, S. Todorovic
Abstract: This paper presents a novel deep architecture, called neural regression forest (NRF), for depth estimation from a single image. NRF combines random forests and convolutional neural networks (CNNs). Scanning windows extracted from the image represent samples, which are passed down the trees of the NRF to predict their depth. At every tree node, the sample is filtered with a CNN associated with that node. The results of the convolutional filtering are passed to the left and right child nodes (i.e., the corresponding CNNs) with a Bernoulli probability, until the leaves, where depth estimates are made. The CNNs at every node are designed to have fewer parameters than in recent work, but their stacked processing along a path in the tree effectively amounts to a deeper CNN. NRF allows for parallelizable training of all "shallow" CNNs and efficient enforcement of smoothness in the depth estimation results. Our evaluation on the benchmark Make3D and NYUv2 datasets demonstrates that NRF outperforms the state of the art and gracefully handles gradually decreasing training datasets.
Pages: 5506-5514
Citations: 292
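The routing mechanism described above sends a sample from a node to its left or right child CNN with a Bernoulli probability, so each leaf receives the sample with the product of the routing probabilities along its path, and the prediction is a probability-weighted combination of leaf values. The sketch below illustrates that soft routing on a tiny complete binary tree, with a linear-sigmoid unit standing in for each node's CNN; the tree size, leaf depth values, and routing function are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def nrf_predict(x, node_w, node_b, leaf_depths):
    """Soft routing through a tiny complete binary tree.

    Each internal node i (heap order: children of i are 2i+1, 2i+2) produces
    p_i = sigmoid(w_i . x + b_i), the probability of routing the sample to its
    left child.  A sample reaches every leaf with the product of the routing
    probabilities on its path, and the depth estimate is the probability-
    weighted average of the leaf values.
    """
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    n_leaves = len(leaf_depths)              # tree has n_leaves - 1 internal nodes
    p = sigmoid(node_w @ x + node_b)         # routing probability at each internal node
    leaf_prob = np.ones(n_leaves)
    for leaf in range(n_leaves):
        node, lo, hi = 0, 0, n_leaves
        while hi - lo > 1:                   # walk from the root down to this leaf
            mid = (lo + hi) // 2
            if leaf < mid:                   # go left
                leaf_prob[leaf] *= p[node]
                node, hi = 2 * node + 1, mid
            else:                            # go right
                leaf_prob[leaf] *= 1.0 - p[node]
                node, lo = 2 * node + 2, mid
    return float(leaf_prob @ leaf_depths)

# toy usage: depth-3 tree (7 internal nodes, 8 leaves), 10-d input feature
rng = np.random.default_rng(4)
x = rng.random(10)
print(nrf_predict(x, rng.normal(size=(7, 10)), rng.normal(size=7),
                  leaf_depths=np.linspace(0.5, 10.0, 8)))
```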
Canny Text Detector: Fast and Robust Scene Text Localization Algorithm
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2016-06-27 DOI: 10.1109/CVPR.2016.388
Hojin Cho, Myung-Chul Sung, Bongjin Jun
Abstract: This paper presents a novel scene text detection algorithm, Canny Text Detector, which takes advantage of the similarity between image edges and text for effective text localization with an improved recall rate. Just as closely related edge pixels form the structural information of an object, we observe that cohesive characters compose a meaningful word or sentence and share similar properties such as spatial location, size, color, and stroke width, regardless of language. However, prevalent scene text detection approaches have not fully utilized this similarity and mostly rely on characters classified with high confidence, leading to a low recall rate. By exploiting the similarity, our approach can quickly and robustly localize a variety of texts. Inspired by the original Canny edge detector, our algorithm makes use of double thresholding and hysteresis tracking to detect text of low confidence. Experimental results on public datasets demonstrate that our algorithm outperforms state-of-the-art scene text detection methods in terms of detection rate.
Pages: 3566-3573
Citations: 104
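The abstract above borrows double thresholding and hysteresis tracking from the Canny edge detector: high-confidence character candidates act as seeds, and lower-confidence candidates are kept only when they connect to a seed through the grouping cues (similar location, size, color, and stroke width). The sketch below illustrates that selection rule on a toy candidate graph; the thresholds, scores, and adjacency are placeholders rather than the paper's actual values.

```python
from collections import deque

def hysteresis_select(scores, neighbors, high=0.8, low=0.4):
    """Double-threshold + hysteresis over character candidates.

    scores    : list of classifier confidences, one per candidate region.
    neighbors : adjacency list; i and j are linked when the regions share
                similar size/color/stroke width and are spatially close
                (the grouping cues named in the abstract).
    Candidates above `high` are kept as seeds; candidates above `low` are
    kept only if they connect to a seed through other kept candidates.
    """
    keep = [s >= high for s in scores]
    queue = deque(i for i, k in enumerate(keep) if k)
    while queue:
        i = queue.popleft()
        for j in neighbors[i]:
            if not keep[j] and scores[j] >= low:
                keep[j] = True
                queue.append(j)
    return [i for i, k in enumerate(keep) if k]

# toy usage: 5 candidates in a chain; only candidates 0 and 4 are confident alone
scores = [0.9, 0.5, 0.45, 0.3, 0.85]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
print(hysteresis_select(scores, neighbors))   # -> [0, 1, 2, 4]
```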
Object Co-segmentation via Graph Optimized-Flexible Manifold Ranking
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2016-06-27 DOI: 10.1109/CVPR.2016.81
Rong Quan, Junwei Han, Dingwen Zhang, F. Nie
Abstract: Object co-segmentation, which aims to automatically discover the common objects contained in a set of relevant images and simultaneously segment them as foreground, has become an active research topic in recent years. Although a number of approaches have been proposed to address this problem, many of them are designed with misleading assumptions, unscalable priors, or low flexibility, and thus still suffer from limitations that reduce their capability in real-world scenarios. To alleviate these limitations, we propose a novel two-stage co-segmentation framework, which introduces a weak background prior to establish a globally closed-loop graph that represents the common objects and the union background separately. A novel graph optimized-flexible manifold ranking algorithm is then proposed to flexibly optimize the graph connections and node labels in order to co-segment the common objects. Experiments on three image datasets demonstrate that our method outperforms other state-of-the-art methods.
Pages: 687-695
Citations: 79
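The second stage above ranks graph nodes to separate common objects from the union background. The sketch below shows only standard manifold ranking on a fixed affinity graph, i.e., the classical closed-form solution; the paper's joint optimization of the graph connections and its flexible label-fitting term are not reproduced here, and the toy affinities and seeds are placeholders.

```python
import numpy as np

def manifold_ranking(W, y, alpha=0.99):
    """Standard manifold ranking on a fixed graph.

    W : (n, n) symmetric affinity matrix between regions/superpixels.
    y : (n,)   initial label vector (e.g., 1 for prior seeds, 0 otherwise).
    Returns ranking scores f = (I - alpha * S)^(-1) y, with S the
    symmetrically normalized affinity matrix.
    """
    d = W.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(np.maximum(d, 1e-12)))
    S = D_inv_sqrt @ W @ D_inv_sqrt
    n = W.shape[0]
    return np.linalg.solve(np.eye(n) - alpha * S, y)

# toy usage: 4 regions; 0-1 strongly connected, 2-3 strongly connected
W = np.array([[0.0, 0.9, 0.1, 0.0],
              [0.9, 0.0, 0.1, 0.0],
              [0.1, 0.1, 0.0, 0.8],
              [0.0, 0.0, 0.8, 0.0]])
print(manifold_ranking(W, y=np.array([1.0, 0.0, 0.0, 0.0])))
```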
Structural Correlation Filter for Robust Visual Tracking
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2016-06-27 DOI: 10.1109/CVPR.2016.467
Si Liu, Tianzhu Zhang, Xiaochun Cao, Changsheng Xu
Abstract: In this paper, we propose a novel structural correlation filter (SCF) model for robust visual tracking. The proposed SCF model takes part-based tracking strategies into account in a correlation filter tracker, and exploits circular shifts of all parts for their motion modeling to preserve the target object structure. Compared with existing correlation filter trackers, the proposed tracker has several advantages: (1) owing to the part-based strategy, the learned structural correlation filters are less sensitive to partial occlusion and are computationally efficient and robust; (2) the learned filters are able not only to distinguish the parts from the background, as traditional correlation filters do, but also to exploit the intrinsic relationships among local parts via spatial constraints to preserve the object structure; (3) the learned correlation filters not only make most parts share similar motion, but also tolerate outlier parts that have different motion. Both qualitative and quantitative evaluations on challenging benchmark image sequences demonstrate that the proposed SCF tracking algorithm performs favorably against several state-of-the-art methods.
Pages: 4312-4320
Citations: 165
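A structural correlation filter builds on ordinary correlation filters learned per part in the Fourier domain. The sketch below shows a single-part, single-channel MOSSE-style filter (training and detection only); the paper's structural model additionally couples multiple part filters through spatial constraints, which this sketch omits, and the patch size, Gaussian response, and regularization value are placeholders.

```python
import numpy as np

def train_filter(patch, target, lam=1e-2):
    """Single-channel correlation filter learned in the Fourier domain.

    patch  : (H, W) grayscale template around the tracked part.
    target : (H, W) desired response, typically a Gaussian peaked at the center.
    Returns the conjugate filter H* such that real(ifft2(H* * F)) ~ target.
    """
    F = np.fft.fft2(patch)
    G = np.fft.fft2(target)
    return (G * np.conj(F)) / (F * np.conj(F) + lam)

def detect(filt, patch):
    """Correlate a new patch with the learned filter; the response peak
    gives the translation of the part within the search window."""
    resp = np.real(np.fft.ifft2(filt * np.fft.fft2(patch)))
    return np.unravel_index(np.argmax(resp), resp.shape)

# toy usage: 32x32 patch with a Gaussian response centered in the patch
yy, xx = np.mgrid[0:32, 0:32]
target = np.exp(-((yy - 16) ** 2 + (xx - 16) ** 2) / (2 * 2.0 ** 2))
patch = np.random.default_rng(5).random((32, 32))
filt = train_filter(patch, target)
print(detect(filt, patch))   # should be near (16, 16) on the training patch
```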