2017 IEEE International Conference on Computer Vision (ICCV)最新文献_第8页

Robust Hand Pose Estimation during the Interaction with an Unknown Object 与未知物体交互过程中手部姿态的鲁棒估计

2017 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2017-10-01 DOI: 10.1109/ICCV.2017.339

Chiho Choi, S. Yoon, China Chen, K. Ramani

引用次数: 53

Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks 伪三维残差网络的时空表征学习

2017 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2017-10-01 DOI: 10.1109/ICCV.2017.590

Zhaofan Qiu, Ting Yao, Tao Mei

{"title":"Learning Spatio-Temporal Representation with Pseudo-3D Residual Networks","authors":"Zhaofan Qiu, Ting Yao, Tao Mei","doi":"10.1109/ICCV.2017.590","DOIUrl":"https://doi.org/10.1109/ICCV.2017.590","url":null,"abstract":"Convolutional Neural Networks (CNN) have been regarded as a powerful class of models for image recognition problems. Nevertheless, it is not trivial when utilizing a CNN for learning spatio-temporal video representation. A few studies have shown that performing 3D convolutions is a rewarding approach to capture both spatial and temporal dimensions in videos. However, the development of a very deep 3D CNN from scratch results in expensive computational cost and memory demand. A valid question is why not recycle off-the-shelf 2D networks for a 3D CNN. In this paper, we devise multiple variants of bottleneck building blocks in a residual learning framework by simulating 3 x 3 x 3 convolutions with 1 × 3 × 3 convolutional filters on spatial domain (equivalent to 2D CNN) plus 3 × 1 × 1 convolutions to construct temporal connections on adjacent feature maps in time. Furthermore, we propose a new architecture, named Pseudo-3D Residual Net (P3D ResNet), that exploits all the variants of blocks but composes each in different placement of ResNet, following the philosophy that enhancing structural diversity with going deep could improve the power of neural networks. Our P3D ResNet achieves clear improvements on Sports-1M video classification dataset against 3D CNN and frame-based 2D CNN by 5.3% and 1.8%, respectively. We further examine the generalization performance of video representation produced by our pre-trained P3D ResNet on five different benchmarks and three different tasks, demonstrating superior performances over several state-of-the-art techniques.","PeriodicalId":6559,"journal":{"name":"2017 IEEE International Conference on Computer Vision (ICCV)","volume":"7 1","pages":"5534-5542"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80574257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 1420

Personalized Image Aesthetics 个性化形象美学

2017 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2017-10-01 DOI: 10.1109/ICCV.2017.76

Jian Ren, Xiaohui Shen, Zhe L. Lin, R. Mech, D. Foran

{"title":"Personalized Image Aesthetics","authors":"Jian Ren, Xiaohui Shen, Zhe L. Lin, R. Mech, D. Foran","doi":"10.1109/ICCV.2017.76","DOIUrl":"https://doi.org/10.1109/ICCV.2017.76","url":null,"abstract":"Automatic image aesthetics rating has received a growing interest with the recent breakthrough in deep learning. Although many studies exist for learning a generic or universal aesthetics model, investigation of aesthetics models incorporating individual user’s preference is quite limited. We address this personalized aesthetics problem by showing that individual’s aesthetic preferences exhibit strong correlations with content and aesthetic attributes, and hence the deviation of individual’s perception from generic image aesthetics is predictable. To accommodate our study, we first collect two distinct datasets, a large image dataset from Flickr and annotated by Amazon Mechanical Turk, and a small dataset of real personal albums rated by owners. We then propose a new approach to personalized aesthetics learning that can be trained even with a small set of annotated images from a user. The approach is based on a residual-based model adaptation scheme which learns an offset to compensate for the generic aesthetics score. Finally, we introduce an active learning algorithm to optimize personalized aesthetics prediction for real-world application scenarios. Experiments demonstrate that our approach can effectively learn personalized aesthetics preferences, and outperforms existing methods on quantitative comparisons.","PeriodicalId":6559,"journal":{"name":"2017 IEEE International Conference on Computer Vision (ICCV)","volume":"17 1","pages":"638-647"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89508891","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 86

Going Unconstrained with Rolling Shutter Deblurring 不受约束地使用滚动快门去模糊

2017 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2017-10-01 DOI: 10.1109/ICCV.2017.432

R. MaheshMohanM., A. Rajagopalan

引用次数: 13

A Microfacet-Based Reflectance Model for Photometric Stereo with Highly Specular Surfaces 基于微面的高镜面光度计立体反射模型

2017 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2017-10-01 DOI: 10.1109/ICCV.2017.343

Lixiong Chen, Yinqiang Zheng, Boxin Shi, Art Subpa-Asa, Imari Sato

引用次数: 18

Surface Registration via Foliation 通过叶理进行表面配准

2017 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2017-10-01 DOI: 10.1109/ICCV.2017.107

Xiaopeng Zheng, Chengfeng Wen, Na Lei, Ming Ma, X. Gu

引用次数: 8

VegFru: A Domain-Specific Dataset for Fine-Grained Visual Categorization VegFru:用于细粒度视觉分类的特定领域数据集

2017 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2017-10-01 DOI: 10.1109/ICCV.2017.66

Saihui Hou, Yushan Feng, Zilei Wang

{"title":"VegFru: A Domain-Specific Dataset for Fine-Grained Visual Categorization","authors":"Saihui Hou, Yushan Feng, Zilei Wang","doi":"10.1109/ICCV.2017.66","DOIUrl":"https://doi.org/10.1109/ICCV.2017.66","url":null,"abstract":"In this paper, we propose a novel domain-specific dataset named VegFru for fine-grained visual categorization (FGVC). While the existing datasets for FGVC are mainly focused on animal breeds or man-made objects with limited labelled data, VegFru is a larger dataset consisting of vegetables and fruits which are closely associated with the daily life of everyone. Aiming at domestic cooking and food management, VegFru categorizes vegetables and fruits according to their eating characteristics, and each image contains at least one edible part of vegetables or fruits with the same cooking usage. Particularly, all the images are labelled hierarchically. The current version covers vegetables and fruits of 25 upper-level categories and 292 subordinate classes. And it contains more than 160,000 images in total and at least 200 images for each subordinate class. Accompanying the dataset, we also propose an effective framework called HybridNet to exploit the label hierarchy for FGVC. Specifically, multiple granularity features are first extracted by dealing with the hierarchical labels separately. And then they are fused through explicit operation, e.g., Compact Bilinear Pooling, to form a unified representation for the ultimate recognition. The experimental results on the novel VegFru, the public FGVC-Aircraft and CUB-200-2011 indicate that HybridNet achieves one of the top performance on these datasets. The dataset and code are available at https://github.com/ustc-vim/vegfru.","PeriodicalId":6559,"journal":{"name":"2017 IEEE International Conference on Computer Vision (ICCV)","volume":"68 1","pages":"541-549"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84197544","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 84

DeepCD: Learning Deep Complementary Descriptors for Patch Representations DeepCD:学习补丁表示的深度互补描述符

2017 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2017-10-01 DOI: 10.1109/ICCV.2017.359

Tsun-Yi Yang, Jo-Han Hsu, Yen-Yu Lin, Yung-Yu Chuang

引用次数: 38

Region-Based Correspondence Between 3D Shapes via Spatially Smooth Biclustering 基于空间平滑双聚类的三维形状区域对应

2017 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2017-10-01 DOI: 10.1109/ICCV.2017.457

M. Denitto, S. Melzi, M. Bicego, U. Castellani, A. Farinelli, Mário A. T. Figueiredo, Yanir Kleiman, M. Ovsjanikov

{"title":"Region-Based Correspondence Between 3D Shapes via Spatially Smooth Biclustering","authors":"M. Denitto, S. Melzi, M. Bicego, U. Castellani, A. Farinelli, Mário A. T. Figueiredo, Yanir Kleiman, M. Ovsjanikov","doi":"10.1109/ICCV.2017.457","DOIUrl":"https://doi.org/10.1109/ICCV.2017.457","url":null,"abstract":"Region-based correspondence (RBC) is a highly relevant and non-trivial computer vision problem. Given two 3D shapes, RBC seeks segments/regions on these shapes that can be reliably put in correspondence. The problem thus consists both in finding the regions and determining the correspondences between them. This problem statement is similar to that of “biclustering ”, implying that RBC can be cast as a biclustering problem. Here, we exploit this implication by tackling RBC via a novel biclustering approach, called S4B (spatially smooth spike and slab biclustering), which: (i) casts the problem in a probabilistic low-rank matrix factorization perspective; (ii) uses a spike and slab prior to induce sparsity; (iii) is enriched with a spatial smoothness prior, based on geodesic distances, encouraging nearby vertices to belong to the same bicluster. This type of spatial prior cannot be used in classical biclustering techniques. We test the proposed approach on the FAUST dataset, outperforming both state-of-the-art RBC techniques and classical biclustering methods.","PeriodicalId":6559,"journal":{"name":"2017 IEEE International Conference on Computer Vision (ICCV)","volume":"93 1","pages":"4270-4279"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79433023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 6

Monocular Video-Based Trailer Coupler Detection Using Multiplexer Convolutional Neural Network 基于多路卷积神经网络的单目视频拖车耦合器检测

2017 IEEE International Conference on Computer Vision (ICCV) Pub Date : 2017-10-01 DOI: 10.1109/ICCV.2017.584

Yousef Atoum, Joseph Roth, Michael Bliss, Wende Zhang, Xiaoming Liu

引用次数: 8