{"title":"Action Recognition by Hierarchical Sequence Summarization","authors":"Yale Song, Louis-Philippe Morency, Randall Davis","doi":"10.1109/CVPR.2013.457","DOIUrl":"https://doi.org/10.1109/CVPR.2013.457","url":null,"abstract":"Recent progress has shown that learning from hierarchical feature representations leads to improvements in various computer vision tasks. Motivated by the observation that human activity data contains information at various temporal resolutions, we present a hierarchical sequence summarization approach for action recognition that learns multiple layers of discriminative feature representations at different temporal granularities. We build up a hierarchy dynamically and recursively by alternating sequence learning and sequence summarization. For sequence learning we use CRFs with latent variables to learn hidden spatio-temporal dynamics, for sequence summarization we group observations that have similar semantic meaning in the latent space. For each layer we learn an abstract feature representation through non-linear gate functions. This procedure is repeated to obtain a hierarchical sequence summary representation. We develop an efficient learning method to train our model and show that its complexity grows sub linearly with the size of the hierarchy. Experimental results show the effectiveness of our approach, achieving the best published results on the Arm Gesture and Canal9 datasets.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"33 1","pages":"3562-3569"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79482796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Relative Volume Constraints for Single View 3D Reconstruction","authors":"Eno Töppe, C. Nieuwenhuis, D. Cremers","doi":"10.1109/CVPR.2013.30","DOIUrl":"https://doi.org/10.1109/CVPR.2013.30","url":null,"abstract":"We introduce the concept of relative volume constraints in order to account for insufficient information in the reconstruction of 3D objects from a single image. The key idea is to formulate a variational reconstruction approach with shape priors in form of relative depth profiles or volume ratios relating object parts. Such shape priors can easily be derived either from a user sketch or from the object's shading profile in the image. They can handle textured or shadowed object regions by propagating information. We propose a convex relaxation of the constrained optimization problem which can be solved optimally in a few seconds on graphics hardware. In contrast to existing single view reconstruction algorithms, the proposed algorithm provides substantially more flexibility to recover shape details such as self-occlusions, dents and holes, which are not visible in the object silhouette.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"372 1","pages":"177-184"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84549529","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Spatiotemporal Deformable Part Models for Action Detection","authors":"Yicong Tian, R. Sukthankar, M. Shah","doi":"10.1109/CVPR.2013.341","DOIUrl":"https://doi.org/10.1109/CVPR.2013.341","url":null,"abstract":"Deformable part models have achieved impressive performance for object detection, even on difficult image datasets. This paper explores the generalization of deformable part models from 2D images to 3D spatiotemporal volumes to better study their effectiveness for action detection in video. Actions are treated as spatiotemporal patterns and a deformable part model is generated for each action from a collection of examples. For each action model, the most discriminative 3D sub volumes are automatically selected as parts and the spatiotemporal relations between their locations are learned. By focusing on the most distinctive parts of each action, our models adapt to intra-class variation and show robustness to clutter. Extensive experiments on several video datasets demonstrate the strength of spatiotemporal DPMs for classifying and localizing actions.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"88 1","pages":"2642-2649"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80807786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast Convolutional Sparse Coding","authors":"H. Bristow, Anders P. Eriksson, S. Lucey","doi":"10.1109/CVPR.2013.57","DOIUrl":"https://doi.org/10.1109/CVPR.2013.57","url":null,"abstract":"Sparse coding has become an increasingly popular method in learning and vision for a variety of classification, reconstruction and coding tasks. The canonical approach intrinsically assumes independence between observations during learning. For many natural signals however, sparse coding is applied to sub-elements ( i.e. patches) of the signal, where such an assumption is invalid. Convolutional sparse coding explicitly models local interactions through the convolution operator, however the resulting optimization problem is considerably more complex than traditional sparse coding. In this paper, we draw upon ideas from signal processing and Augmented Lagrange Methods (ALMs) to produce a fast algorithm with globally optimal sub problems and super-linear convergence.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"37 1","pages":"391-398"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80455468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"3D Visual Proxemics: Recognizing Human Interactions in 3D from a Single Image","authors":"Ishani Chakraborty, Hui Cheng, O. Javed","doi":"10.1109/CVPR.2013.437","DOIUrl":"https://doi.org/10.1109/CVPR.2013.437","url":null,"abstract":"We present a unified framework for detecting and classifying people interactions in unconstrained user generated images. Unlike previous approaches that directly map people/face locations in 2D image space into features for classification, we first estimate camera viewpoint and people positions in 3D space and then extract spatial configuration features from explicit 3D people positions. This approach has several advantages. First, it can accurately estimate relative distances and orientations between people in 3D. Second, it encodes spatial arrangements of people into a richer set of shape descriptors than afforded in 2D. Our 3D shape descriptors are invariant to camera pose variations often seen in web images and videos. The proposed approach also estimates camera pose and uses it to capture the intent of the photo. To achieve accurate 3D people layout estimation, we develop an algorithm that robustly fuses semantic constraints about human interpositions into a linear camera model. This enables our model to handle large variations in people size, heights (e.g. age) and poses. An accurate 3D layout also allows us to construct features informed by Proxemics that improves our semantic classification. To characterize the human interaction space, we introduce visual proxemes, a set of prototypical patterns that represent commonly occurring social interactions in events. We train a discriminative classifier that classifies 3D arrangements of people into visual proxemes and quantitatively evaluate the performance on a large, challenging dataset.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"30 1","pages":"3406-3413"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83336180","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Voxel Cloud Connectivity Segmentation - Supervoxels for Point Clouds","authors":"Jeremie Papon, A. Abramov, Markus Schoeler, F. Wörgötter","doi":"10.1109/CVPR.2013.264","DOIUrl":"https://doi.org/10.1109/CVPR.2013.264","url":null,"abstract":"Unsupervised over-segmentation of an image into regions of perceptually similar pixels, known as super pixels, is a widely used preprocessing step in segmentation algorithms. Super pixel methods reduce the number of regions that must be considered later by more computationally expensive algorithms, with a minimal loss of information. Nevertheless, as some information is inevitably lost, it is vital that super pixels not cross object boundaries, as such errors will propagate through later steps. Existing methods make use of projected color or depth information, but do not consider three dimensional geometric relationships between observed data points which can be used to prevent super pixels from crossing regions of empty space. We propose a novel over-segmentation algorithm which uses voxel relationships to produce over-segmentations which are fully consistent with the spatial geometry of the scene in three dimensional, rather than projective, space. Enforcing the constraint that segmented regions must have spatial connectivity prevents label flow across semantic object boundaries which might otherwise be violated. Additionally, as the algorithm works directly in 3D space, observations from several calibrated RGB+D cameras can be segmented jointly. Experiments on a large data set of human annotated RGB+D images demonstrate a significant reduction in occurrence of clusters crossing object boundaries, while maintaining speeds comparable to state-of-the-art 2D methods.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"99 1","pages":"2027-2034"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81372882","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Winding Number for Region-Boundary Consistent Salient Contour Extraction","authors":"Y. Ming, Hongdong Li, Xuming He","doi":"10.1109/CVPR.2013.363","DOIUrl":"https://doi.org/10.1109/CVPR.2013.363","url":null,"abstract":"This paper aims to extract salient closed contours froman image. For this vision task, both region segmentation cues (e.g. color/texture homogeneity) and boundary detection cues (e.g. local contrast, edge continuity and contour closure) play important and complementary roles. In this paper we show how to combine both cues in a unified framework. The main focus is given to how to maintain the consistency (compatibility) between the region cues and the boundary cues. To this ends, we introduce the use of winding number-a well-known concept in topology-as a powerful mathematical device. By this device, the region-boundary consistency is represented as aset of simple linear relationships. Our method is applied to the figure-ground segmentation problem. The experiments show clearly improved results.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"33 1","pages":"2818-2825"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81583349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Expressive Visual Text-to-Speech Using Active Appearance Models","authors":"Robert Anderson, B. Stenger, V. Wan, R. Cipolla","doi":"10.1109/CVPR.2013.434","DOIUrl":"https://doi.org/10.1109/CVPR.2013.434","url":null,"abstract":"This paper presents a complete system for expressive visual text-to-speech (VTTS), which is capable of producing expressive output, in the form of a 'talking head', given an input text and a set of continuous expression weights. The face is modeled using an active appearance model (AAM), and several extensions are proposed which make it more applicable to the task of VTTS. The model allows for normalization with respect to both pose and blink state which significantly reduces artifacts in the resulting synthesized sequences. We demonstrate quantitative improvements in terms of reconstruction error over a million frames, as well as in large-scale user studies, comparing the output of different systems.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"523 1","pages":"3382-3389"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78858877","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Three-Dimensional Bilateral Symmetry Plane Estimation in the Phase Domain","authors":"R. Kakarala, P. Kaliamoorthi, Vittal Premachandran","doi":"10.1109/CVPR.2013.39","DOIUrl":"https://doi.org/10.1109/CVPR.2013.39","url":null,"abstract":"We show that bilateral symmetry plane estimation for three-dimensional (3-D) shapes may be carried out accurately, and efficiently, in the spherical harmonic domain. Our methods are valuable for applications where spherical harmonic expansion is already employed, such as 3-D shape registration, morphometry, and retrieval. We show that the presence of bilateral symmetry in the 3-D shape is equivalent to a linear phase structure in the corresponding spherical harmonic coefficients, and provide algorithms for estimating the orientation of the symmetry plane. The benefit of using spherical harmonic phase is that symmetry estimation reduces to matching a compact set of descriptors, without the need to solve a correspondence problem. Our methods work on point clouds as well as large-scale mesh models of 3-D shapes.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"48 9 1","pages":"249-256"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82818493","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Real-Time Model-Based Rigid Object Pose Estimation and Tracking Combining Dense and Sparse Visual Cues","authors":"Karl Pauwels, Leonardo Rubio, Javier Díaz, E. Ros","doi":"10.1109/CVPR.2013.304","DOIUrl":"https://doi.org/10.1109/CVPR.2013.304","url":null,"abstract":"We propose a novel model-based method for estimating and tracking the six-degrees-of-freedom (6DOF) pose of rigid objects of arbitrary shapes in real-time. By combining dense motion and stereo cues with sparse key point correspondences, and by feeding back information from the model to the cue extraction level, the method is both highly accurate and robust to noise and occlusions. A tight integration of the graphical and computational capability of Graphics Processing Units (GPUs) results in pose updates at frame rates exceeding 60 Hz. Since a benchmark dataset that enables the evaluation of stereo-vision-based pose estimators in complex scenarios is currently missing in the literature, we have introduced a novel synthetic benchmark dataset with varying objects, background motion, noise and occlusions. Using this dataset and a novel evaluation methodology, we show that the proposed method greatly outperforms state-of-the-art methods. Finally, we demonstrate excellent performance on challenging real-world sequences involving object manipulation.","PeriodicalId":6343,"journal":{"name":"2013 IEEE Conference on Computer Vision and Pattern Recognition","volume":"23 1","pages":"2347-2354"},"PeriodicalIF":0.0,"publicationDate":"2013-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90076835","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}