Latest Publications: 2015 IEEE International Conference on Computer Vision (ICCV)

Visual Madlibs: Fill in the Blank Description Generation and Question Answering
2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.283
Licheng Yu, Eunbyung Park, A. Berg, Tamara L. Berg
{"title":"Visual Madlibs: Fill in the Blank Description Generation and Question Answering","authors":"Licheng Yu, Eunbyung Park, A. Berg, Tamara L. Berg","doi":"10.1109/ICCV.2015.283","DOIUrl":"https://doi.org/10.1109/ICCV.2015.283","url":null,"abstract":"In this paper, we introduce a new dataset consisting of 360,001 focused natural language descriptions for 10,738 images. This dataset, the Visual Madlibs dataset, is collected using automatically produced fill-in-the-blank templates designed to gather targeted descriptions about: people and objects, their appearances, activities, and interactions, as well as inferences about the general scene or its broader context. We provide several analyses of the Visual Madlibs dataset and demonstrate its applicability to two new description generation tasks: focused description generation, and multiple-choice question-answering for images. Experiments using joint-embedding and deep learning methods show promising results on these tasks.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"C1 1","pages":"2461-2469"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85197306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 135
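
The multiple-choice question-answering experiments above rely on a joint embedding of image and text features. The sketch below is a rough illustration of that scoring idea only, not the authors' trained model: the projection matrices are random stand-ins for learned maps, and all dimensionalities and variable names are assumptions made for the example.

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def answer_multiple_choice(img_feat, choice_feats, W_img, W_txt):
    """Project both modalities into a shared space and pick the
    candidate answer whose embedding lies closest to the image's."""
    z_img = W_img @ img_feat                      # image -> joint space
    scores = [cosine(z_img, W_txt @ c) for c in choice_feats]
    return int(np.argmax(scores)), scores

# Toy usage: random features and projections stand in for learned ones.
rng = np.random.default_rng(0)
d_img, d_txt, d_joint = 4096, 300, 256            # assumed dimensions
W_img = rng.standard_normal((d_joint, d_img)) / np.sqrt(d_img)
W_txt = rng.standard_normal((d_joint, d_txt)) / np.sqrt(d_txt)
img_feat = rng.standard_normal(d_img)             # e.g. a CNN feature
choices = [rng.standard_normal(d_txt) for _ in range(4)]
best, _ = answer_multiple_choice(img_feat, choices, W_img, W_txt)
print("selected choice:", best)
```
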
Attributed Grammars for Joint Estimation of Human Attributes, Part and Pose
2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.273
Seyoung Park, Song-Chun Zhu
{"title":"Attributed Grammars for Joint Estimation of Human Attributes, Part and Pose","authors":"Seyoung Park, Song-Chun Zhu","doi":"10.1109/ICCV.2015.273","DOIUrl":"https://doi.org/10.1109/ICCV.2015.273","url":null,"abstract":"In this paper, we are interested in developing compositional models to explicit representing pose, parts and attributes and tackling the tasks of attribute recognition, pose estimation and part localization jointly. This is different from the recent trend of using CNN-based approaches for training and testing on these tasks separately with a large amount of data. Conventional attribute models typically use a large number of region-based attribute classifiers on parts of pre-trained pose estimator without explicitly detecting the object or its parts, or considering the correlations between attributes. In contrast, our approach jointly represents both the object parts and their semantic attributes within a unified compositional hierarchy. We apply our attributed grammar model to the task of human parsing by simultaneously performing part localization and attribute recognition. We show our modeling helps performance improvements on pose-estimation task and also outperforms on other existing methods on attribute prediction task.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"43 1","pages":"2372-2380"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85480385","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 23
Sparse Dynamic 3D Reconstruction from Unsynchronized Videos
2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.504
Enliang Zheng, Dinghuang Ji, Enrique Dunn, Jan-Michael Frahm
{"title":"Sparse Dynamic 3D Reconstruction from Unsynchronized Videos","authors":"Enliang Zheng, Dinghuang Ji, Enrique Dunn, Jan-Michael Frahm","doi":"10.1109/ICCV.2015.504","DOIUrl":"https://doi.org/10.1109/ICCV.2015.504","url":null,"abstract":"We target the sparse 3D reconstruction of dynamic objects observed by multiple unsynchronized video cameras with unknown temporal overlap. To this end, we develop a framework to recover the unknown structure without sequencing information across video sequences. Our proposed compressed sensing framework poses the estimation of 3D structure as the problem of dictionary learning. Moreover, we define our dictionary as the temporally varying 3D structure, while we define local sequencing information in terms of the sparse coefficients describing a locally linear 3D structural interpolation. Our formulation optimizes a biconvex cost function that leverages a compressed sensing formulation and enforces both structural dependency coherence across video streams, as well as motion smoothness across estimates from common video sources. Experimental results demonstrate the effectiveness of our approach in both synthetic data and captured imagery.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"63 1","pages":"4435-4443"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83870310","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 23
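
The entry above casts structure estimation as dictionary learning with sparse coefficients. As a minimal sketch of generic l1-regularized dictionary learning (not the paper's biconvex formulation, which adds temporal-coherence and motion-smoothness terms), the following alternates ISTA sparse coding with a least-squares dictionary update; all parameter choices are illustrative assumptions.

```python
import numpy as np

def soft_threshold(x, t):
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def dictionary_learning(X, n_atoms, lam=0.1, n_outer=20, n_ista=50):
    """Alternately solve min_{D,C} ||X - D C||_F^2 + lam * ||C||_1.
    X is a (d, n) data matrix; returns dictionary D and sparse codes C."""
    d, n = X.shape
    rng = np.random.default_rng(0)
    D = rng.standard_normal((d, n_atoms))
    D /= np.linalg.norm(D, axis=0, keepdims=True)
    C = np.zeros((n_atoms, n))
    for _ in range(n_outer):
        # Sparse-coding step: ISTA with step size 1/L, L = ||D||_2^2.
        L = np.linalg.norm(D, 2) ** 2 + 1e-12
        for _ in range(n_ista):
            C = soft_threshold(C - (D.T @ (D @ C - X)) / L, lam / L)
        # Dictionary step: least squares, then renormalize the atoms.
        D = X @ np.linalg.pinv(C)
        D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12
    return D, C
```
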
Photogeometric Scene Flow for High-Detail Dynamic 3D Reconstruction
2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.103
P. Gotardo, T. Simon, Yaser Sheikh, I. Matthews
{"title":"Photogeometric Scene Flow for High-Detail Dynamic 3D Reconstruction","authors":"P. Gotardo, T. Simon, Yaser Sheikh, I. Matthews","doi":"10.1109/ICCV.2015.103","DOIUrl":"https://doi.org/10.1109/ICCV.2015.103","url":null,"abstract":"Photometric stereo (PS) is an established technique for high-detail reconstruction of 3D geometry and appearance. To correct for surface integration errors, PS is often combined with multiview stereo (MVS). With dynamic objects, PS reconstruction also faces the problem of computing optical flow (OF) for image alignment under rapid changes in illumination. Current PS methods typically compute optical flow and MVS as independent stages, each one with its own limitations and errors introduced by early regularization. In contrast, scene flow methods estimate geometry and motion, but lack the fine detail from PS. This paper proposes photogeometric scene flow (PGSF) for high-quality dynamic 3D reconstruction. PGSF performs PS, OF, and MVS simultaneously. It is based on two key observations: (i) while image alignment improves PS, PS allows for surfaces to be relit to improve alignment, (ii) PS provides surface gradients that render the smoothness term in MVS unnecessary, leading to truly data-driven, continuous depth estimates. This synergy is demonstrated in the quality of the resulting RGB appearance, 3D geometry, and 3D motion.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"16 1","pages":"846-854"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83153687","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 44
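
PGSF builds on photometric stereo as one of its ingredients. For reference, the sketch below is the textbook Lambertian PS building block only (known directional lights, per-pixel least squares), not the joint PS/OF/MVS optimization the paper proposes:

```python
import numpy as np

def photometric_stereo(images, lights):
    """Classic Lambertian photometric stereo.  images: (m, h, w) pixel
    intensities under m known directional lights; lights: (m, 3) unit
    light directions.  Solves I = L @ (albedo * normal) per pixel by
    least squares and returns unit normals and per-pixel albedo."""
    m, h, w = images.shape
    I = images.reshape(m, -1)                        # (m, h*w)
    G, *_ = np.linalg.lstsq(lights, I, rcond=None)   # (3, h*w)
    albedo = np.linalg.norm(G, axis=0)
    normals = G / (albedo + 1e-12)
    return normals.reshape(3, h, w), albedo.reshape(h, w)
```
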
Depth Recovery from Light Field Using Focal Stack Symmetry
2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.394
Haiting Lin, Can Chen, S. B. Kang, Jingyi Yu
{"title":"Depth Recovery from Light Field Using Focal Stack Symmetry","authors":"Haiting Lin, Can Chen, S. B. Kang, Jingyi Yu","doi":"10.1109/ICCV.2015.394","DOIUrl":"https://doi.org/10.1109/ICCV.2015.394","url":null,"abstract":"We describe a technique to recover depth from a light field (LF) using two proposed features of the LF focal stack. One feature is the property that non-occluding pixels exhibit symmetry along the focal depth dimension centered at the in-focus slice. The other is a data consistency measure based on analysis-by-synthesis, i.e., the difference between the synthesized focal stack given the hypothesized depth map and that from the LF. These terms are used in an iterative optimization framework to extract scene depth. Experimental results on real Lytro and Raytrix data demonstrate that our technique outperforms state-of-the-art solutions and is significantly more robust to noise and under-sampling.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"66 1","pages":"3451-3459"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78730298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 134
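
The symmetry feature above can be made concrete with a toy example: score each candidate focal slice by how mirror-symmetric the stack is around it, and take the most symmetric slice per pixel as a depth proxy. This sketch ignores the paper's occlusion handling, analysis-by-synthesis consistency term, and iterative optimization; the window radius is an assumed parameter.

```python
import numpy as np

def depth_from_focal_stack(stack, radius=3):
    """stack: (n_slices, h, w) focal stack, slice index ~ focal depth.
    Non-occluded pixels should look symmetric about their in-focus
    slice, so the slice minimizing the asymmetry cost is the estimate."""
    n, h, w = stack.shape
    cost = np.full((n, h, w), np.inf)
    for d in range(radius, n - radius):
        c = np.zeros((h, w))
        for k in range(1, radius + 1):
            c += np.abs(stack[d + k] - stack[d - k])   # mirror pairs around d
        cost[d] = c
    return np.argmin(cost, axis=0)    # per-pixel slice index
```
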
A Novel Sparsity Measure for Tensor Recovery
2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.39
Qian Zhao, Deyu Meng, Xu Kong, Qi Xie, Wenfei Cao, Yao Wang, Zongben Xu
{"title":"A Novel Sparsity Measure for Tensor Recovery","authors":"Qian Zhao, Deyu Meng, Xu Kong, Qi Xie, Wenfei Cao, Yao Wang, Zongben Xu","doi":"10.1109/ICCV.2015.39","DOIUrl":"https://doi.org/10.1109/ICCV.2015.39","url":null,"abstract":"In this paper, we propose a new sparsity regularizer for measuring the low-rank structure underneath a tensor. The proposed sparsity measure has a natural physical meaning which is intrinsically the size of the fundamental Kronecker basis to express the tensor. By embedding the sparsity measure into the tensor completion and tensor robust PCA frameworks, we formulate new models to enhance their capability in tensor recovery. Through introducing relaxation forms of the proposed sparsity measure, we also adopt the alternating direction method of multipliers (ADMM) for solving the proposed models. Experiments implemented on synthetic and multispectral image data sets substantiate the effectiveness of the proposed methods.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"112 1","pages":"271-279"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90757572","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 47
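
The paper's Kronecker-basis sparsity measure is its core contribution and is not reproduced here. As a baseline for what embedding a low-rank measure into tensor completion looks like, the sketch below fills missing entries of a 3-way tensor by averaging mode-wise singular-value-thresholded estimates (a SiLRTC-style heuristic); the threshold and iteration count are assumptions.

```python
import numpy as np

def unfold(T, mode):
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def fold(M, mode, shape):
    other = [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(M.reshape([shape[mode]] + other), 0, mode)

def svt(M, tau):
    """Singular value thresholding: prox operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def tensor_complete(T_obs, mask, tau=1.0, n_iter=100):
    """Complete a 3-way tensor: repeatedly average the three mode-wise
    low-rank (SVT) estimates while keeping observed entries fixed."""
    X = np.where(mask, T_obs, 0.0)
    for _ in range(n_iter):
        est = sum(fold(svt(unfold(X, m), tau), m, X.shape)
                  for m in range(3)) / 3.0
        X = np.where(mask, T_obs, est)   # re-impose observed entries
    return X
```
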
Actionness-Assisted Recognition of Actions
2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.371
Ye Luo, L. Cheong, An Tran
{"title":"Actionness-Assisted Recognition of Actions","authors":"Ye Luo, L. Cheong, An Tran","doi":"10.1109/ICCV.2015.371","DOIUrl":"https://doi.org/10.1109/ICCV.2015.371","url":null,"abstract":"We elicit from a fundamental definition of action low-level attributes that can reveal agency and intentionality. These descriptors are mainly trajectory-based, measuring sudden changes, temporal synchrony, and repetitiveness. The actionness map can be used to localize actions in a way that is generic across action and agent types. Furthermore, it also groups interacting regions into a useful unit of analysis, which is crucial for recognition of actions involving interactions. We then implement an actionness-driven pooling scheme to improve action recognition performance. Experimental results on three datasets show the advantages of our method on both action detection and action recognition comparing with other state-of-the-art methods.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"20 1","pages":"3244-3252"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91153888","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
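
The actionness-driven pooling step admits a very small illustration: weight local features by their actionness scores before averaging, so regions likely to contain an action dominate the video-level descriptor. This shows only the weighting idea, not the paper's trajectory-based attribute computation.

```python
import numpy as np

def actionness_pooling(features, actionness):
    """features: (n_regions, d) local descriptors; actionness:
    (n_regions,) nonnegative scores.  Returns a (d,) pooled feature."""
    w = actionness / (actionness.sum() + 1e-12)   # normalize weights
    return w @ features
```
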
Action Localization in Videos through Context Walk
2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.375
K. Soomro, Haroon Idrees, M. Shah
{"title":"Action Localization in Videos through Context Walk","authors":"K. Soomro, Haroon Idrees, M. Shah","doi":"10.1109/ICCV.2015.375","DOIUrl":"https://doi.org/10.1109/ICCV.2015.375","url":null,"abstract":"This paper presents an efficient approach for localizing actions by learning contextual relations, in the form of relative locations between different video regions. We begin by over-segmenting the videos into supervoxels, which have the ability to preserve action boundaries and also reduce the complexity of the problem. Context relations are learned during training which capture displacements from all the supervoxels in a video to those belonging to foreground actions. Then, given a testing video, we select a supervoxel randomly and use the context information acquired during training to estimate the probability of each supervoxel belonging to the foreground action. The walk proceeds to a new supervoxel and the process is repeated for a few steps. This \"context walk\" generates a conditional distribution of an action over all the supervoxels. A Conditional Random Field is then used to find action proposals in the video, whose confidences are obtained using SVMs. We validated the proposed approach on several datasets and show that context in the form of relative displacements between supervoxels can be extremely useful for action localization. This also results in significantly fewer evaluations of the classifier, in sharp contrast to the alternate sliding window approaches.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"91 1","pages":"3280-3288"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90185039","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 74
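
The context walk itself is easy to caricature in code. In the toy sketch below, training displacements are applied unconditionally from the current supervoxel to cast votes on nearby supervoxels, and the walk moves to the current best-scoring one; the real method conditions displacements on supervoxel similarity and feeds the resulting distribution to a CRF and SVMs.

```python
import numpy as np

def context_walk(centers, displacements, n_steps=10, seed=0):
    """centers: (n, 3) supervoxel centroids (x, y, t); displacements:
    (m, 3) training offsets from arbitrary supervoxels to foreground
    action supervoxels.  Returns a pseudo-distribution over supervoxels."""
    rng = np.random.default_rng(seed)
    n = len(centers)
    votes = np.zeros(n)
    cur = rng.integers(n)                    # random starting supervoxel
    for _ in range(n_steps):
        predicted = centers[cur] + displacements           # (m, 3) vote sites
        dists = np.linalg.norm(centers[:, None] - predicted[None], axis=2)
        np.add.at(votes, dists.argmin(axis=0), 1.0)        # nearest-center votes
        cur = int(votes.argmax())            # walk to the best-scoring one
    return votes / votes.sum()
```
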
An NMF Perspective on Binary Hashing
2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.476
L. Mukherjee, Sathya Ravi, V. Ithapu, Tyler Holmes, Vikas Singh
{"title":"An NMF Perspective on Binary Hashing","authors":"L. Mukherjee, Sathya Ravi, V. Ithapu, Tyler Holmes, Vikas Singh","doi":"10.1109/ICCV.2015.476","DOIUrl":"https://doi.org/10.1109/ICCV.2015.476","url":null,"abstract":"The pervasiveness of massive data repositories has led to much interest in efficient methods for indexing, search, and retrieval. For image data, a rapidly developing body of work for these applications shows impressive performance with methods that broadly fall under the umbrella term of Binary Hashing. Given a distance matrix, a binary hashing algorithm solves for a binary code for the given set of examples, whose Hamming distance nicely approximates the original distances. The formulation is non-convex -- so existing solutions adopt spectral relaxations or perform coordinate descent (or quantization) on a surrogate objective that is numerically more tractable. In this paper, we first derive an Augmented Lagrangian approach to optimize the standard binary Hashing objective (i.e.,maintain fidelity with a given distance matrix). With appropriate step sizes, we find that this scheme already yields results that match or substantially outperform state of the art methods on most benchmarks used in the literature. Then, to allow the model to scale to large datasets, we obtain an interesting reformulation of the binary hashing objective as a non negative matrix factorization. Later, this leads to a simple multiplicative updates algorithm -- whose parallelization properties are exploited to obtain a fast GPU based implementation. We give a probabilistic analysis of our initialization scheme and present a range of experiments to show that the method is simple to implement and competes favorably with available methods (both for optimization and generalization).","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"67 1","pages":"4184-4192"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90195759","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 15
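
The scalable variant above reduces hashing to a nonnegative matrix factorization solved by multiplicative updates. The sketch below shows the generic Lee-Seung updates for V ~ W @ H, which is the class of algorithm being parallelized, not the paper's specific hashing reformulation or its binarization step:

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9):
    """Multiplicative updates for V ~ W @ H with V, W, H >= 0.  Each
    update multiplies by a nonnegative ratio, so nonnegativity is kept
    without projections; that property makes the updates easy to run
    in parallel (e.g., on a GPU)."""
    rng = np.random.default_rng(0)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```
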
BodyPrint: Pose Invariant 3D Shape Matching of Human Bodies
2015 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2015-12-07 DOI: 10.1109/ICCV.2015.186
Jiangping Wang, Kai Ma, V. Singh, Thomas S. Huang, Terrence Chen
{"title":"BodyPrint: Pose Invariant 3D Shape Matching of Human Bodies","authors":"Jiangping Wang, Kai Ma, V. Singh, Thomas S. Huang, Terrence Chen","doi":"10.1109/ICCV.2015.186","DOIUrl":"https://doi.org/10.1109/ICCV.2015.186","url":null,"abstract":"3D human body shape matching has large potential on many real world applications, especially with the recent advances in the 3D range sensing technology. We address this problem by proposing a novel holistic human body shape descriptor called BodyPrint. To compute the bodyprint for a given body scan, we fit a deformable human body mesh and project the mesh parameters to a low-dimensional subspace which improves discriminability across different persons. Experiments are carried out on three real-world human body datasets to demonstrate that BodyPrint is robust to pose variation as well as missing information and sensor noise. It improves the matching accuracy significantly compared to conventional 3D shape matching techniques using local features. To facilitate practical applications where the shape database may grow over time, we also extend our learning framework to handle online updates.","PeriodicalId":6633,"journal":{"name":"2015 IEEE International Conference on Computer Vision (ICCV)","volume":"63 1","pages":"1591-1599"},"PeriodicalIF":0.0,"publicationDate":"2015-12-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90452229","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
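
BodyPrint's matching stage (fit mesh parameters, project to a subspace, compare) can be sketched with plain PCA standing in for the paper's learned, discriminative projection; mesh fitting itself is out of scope here and all names are illustrative.

```python
import numpy as np

def fit_subspace(params, k):
    """params: (n_subjects, d) mesh parameter vectors.  Returns the mean
    and top-k principal directions used as the matching subspace."""
    mu = params.mean(axis=0)
    _, _, Vt = np.linalg.svd(params - mu, full_matrices=False)
    return mu, Vt[:k]

def bodyprint_match(query, gallery, mu, basis):
    """Project query and gallery parameters into the subspace and return
    the index of the nearest gallery subject."""
    q = basis @ (query - mu)
    G = (gallery - mu) @ basis.T
    return int(np.argmin(np.linalg.norm(G - q, axis=1)))
```
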