2014 IEEE Conference on Computer Vision and Pattern Recognition: Latest Papers (Page 3)

On Projective Reconstruction in Arbitrary Dimensions

2014 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.68

B. Nasihatkon, R. Hartley, J. Trumpf

Abstract: We study the theory of projective reconstruction for multiple projections from an arbitrary dimensional projective space into lower-dimensional spaces. This problem is important due to its applications in the analysis of dynamical scenes. The current theory, due to Hartley and Schaffalitzky, is based on the Grassmann tensor, generalizing the ideas of fundamental matrix, trifocal tensor and quadrifocal tensor used in the well-studied case of 3D to 2D projections. We present a theory whose point of departure is the projective equations rather than the Grassmann tensor. This is a better fit for the analysis of approaches such as bundle adjustment and projective factorization which seek to directly solve the projective equations. In a first step, we prove that there is a unique Grassmann tensor corresponding to each set of image points, a question that remained open in the work of Hartley and Schaffalitzky. Then, we prove that projective equivalence follows from the set of projective equations given certain conditions on the estimated camera-point setup or the estimated projective depths. Finally, we demonstrate how wrong solutions to the projective factorization problem can happen, and classify such degenerate solutions based on the zero patterns in the estimated depth matrix.

Citations: 2

Seeing 3D Chairs: Exemplar Part-Based 2D-3D Alignment Using a Large Dataset of CAD Models

2014 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.487

Mathieu Aubry, Daniel Maturana, Alexei A. Efros, Bryan C. Russell, Josef Sivic

Citations: 508

Optimizing Average Precision Using Weakly Supervised Data

2014 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.133

Aseem Behl (IIIT Hyderabad, India), C. V. Jawahar (India), M. Pawan Kumar

Citations: 16

Convolutional Neural Networks for No-Reference Image Quality Assessment

2014 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.224

Le Kang, Peng Ye, Yi Li, D. Doermann

Citations: 860

Multi-output Learning for Camera Relocalization

2014 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.146

Abner Guzmán-Rivera, Pushmeet Kohli, B. Glocker, J. Shotton, T. Sharp, A. Fitzgibbon, S. Izadi

Abstract: We address the problem of estimating the pose of a camera relative to a known 3D scene from a single RGB-D frame. We formulate this problem as inversion of the generative rendering procedure, i.e., we want to find the camera pose corresponding to a rendering of the 3D scene model that is most similar with the observed input. This is a non-convex optimization problem with many local optima. We propose a hybrid discriminative-generative learning architecture that consists of: (i) a set of M predictors which generate M camera pose hypotheses, and (ii) a 'selector' or 'aggregator' that infers the best pose from the multiple pose hypotheses based on a similarity function. We are interested in predictors that not only produce good hypotheses but also hypotheses that are different from each other. Thus, we propose and study methods for learning 'marginally relevant' predictors, and compare their performance when used with different selection procedures. We evaluate our method on a recently released 3D reconstruction dataset with challenging camera poses, and scene variability. Experiments show that our method learns to make multiple predictions that are marginally relevant and can effectively select an accurate prediction. Furthermore, our method outperforms the state-of-the-art discriminative approach for camera relocalization.
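
The two-stage structure described in this abstract (M predictors that each propose a pose, followed by a selector that scores the proposals with a similarity function) can be illustrated with a minimal sketch. This is not the authors' implementation: the scalar "pose" and "frame", the three toy predictors, and the `toy_similarity` function with its stand-in renderer are all hypothetical, chosen only so the example runs on its own.

```python
from typing import Callable, List, Sequence

# Hypothetical toy setup: a "pose" and a "frame" are single floats.
# Real poses would be 6-DoF and real frames would be RGB-D images.
Pose = float
Frame = float
Predictor = Callable[[Frame], Pose]


def generate_hypotheses(predictors: Sequence[Predictor], frame: Frame) -> List[Pose]:
    """Run each of the M predictors once to obtain M pose hypotheses."""
    return [predict(frame) for predict in predictors]


def select_pose(
    hypotheses: Sequence[Pose],
    frame: Frame,
    similarity: Callable[[Pose, Frame], float],
) -> Pose:
    """Selector: return the hypothesis with the highest similarity score."""
    if not hypotheses:
        raise ValueError("at least one hypothesis is required")
    return max(hypotheses, key=lambda pose: similarity(pose, frame))


def toy_similarity(pose: Pose, frame: Frame) -> float:
    """Stand-in for comparing a rendering at `pose` with the observed frame."""
    rendered = 2.0 * pose  # hypothetical "renderer", an assumption of this sketch
    return -abs(rendered - frame)


# Three hypothetical predictors that deliberately disagree with each other.
predictors: List[Predictor] = [
    lambda f: f / 2.0 + 1.0,
    lambda f: f / 2.0 - 0.25,
    lambda f: 0.0,
]

frame = 4.0
hypotheses = generate_hypotheses(predictors, frame)
best = select_pose(hypotheses, frame, toy_similarity)
print(hypotheses, best)
```

Because the selector only needs one good candidate among the M hypotheses, predictors that disagree with each other can be more useful than several near-identical ones, which is the intuition behind the abstract's interest in hypotheses that differ from each other.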

Citations: 98

The Language of Actions: Recovering the Syntax and Semantics of Goal-Directed Human Activities

2014 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.105

Hilde Kuehne, A. B. Arslan, Thomas Serre

Abstract: This paper describes a framework for modeling human activities as temporally structured processes. Our approach is motivated by the inherently hierarchical nature of human activities and the close correspondence between human actions and speech: We model action units using Hidden Markov Models, much like words in speech. These action units then form the building blocks to model complex human activities as sentences using an action grammar. To evaluate our approach, we collected a large dataset of daily cooking activities: The dataset includes a total of 52 participants, each performing a total of 10 cooking activities in multiple real-life kitchens, resulting in over 77 hours of video footage. We evaluate the HTK toolkit, a state-of-the-art speech recognition engine, in combination with multiple video feature descriptors, for both the recognition of cooking activities (e.g., making pancakes) as well as the semantic parsing of videos into action units (e.g., cracking eggs). Our results demonstrate the benefits of structured temporal generative approaches over existing discriminative approaches in coping with the complexity of human daily life activities.

Citations: 452

Tell Me What You See and I Will Show You Where It Is

2014 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.408

Jia Xu, A. Schwing, R. Urtasun

Citations: 94

Partial Symmetry in Polynomial Systems and Its Applications in Computer Vision

2014 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.63

Yubin Kuang, Yinqiang Zheng, Kalle Åström

Citations: 14

Laplacian Coordinates for Seeded Image Segmentation

2014 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.56

Wallace Casaca, L. G. Nonato, G. Taubin

Citations: 56

Finding Matches in a Haystack: A Max-Pooling Strategy for Graph Matching in the Presence of Outliers

2014 IEEE Conference on Computer Vision and Pattern Recognition Pub Date : 2014-06-23 DOI: 10.1109/CVPR.2014.268

Minsu Cho, Jian Sun, Olivier Duchenne, J. Ponce

Citations: 121