How Hard Can It Be? Estimating the Difficulty of Visual Search in an Image
Radu Tudor Ionescu, B. Alexe, Marius Leordeanu, M. Popescu, Dim P. Papadopoulos, V. Ferrari
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2157-2166. DOI: 10.1109/CVPR.2016.237
Abstract: We address the problem of estimating image difficulty, defined as the human response time for solving a visual search task. We collect human annotations of image difficulty for the PASCAL VOC 2012 data set through a crowd-sourcing platform. We then analyze which human-interpretable image properties have an impact on visual search difficulty, and how accurately those properties predict it. Next, we build a regression model based on deep features learned with state-of-the-art convolutional neural networks and show better results for predicting the ground-truth visual search difficulty scores produced by human annotators. Our model correctly ranks about 75% of image pairs according to their difficulty score. We also show that our difficulty predictor generalizes well to new classes not seen during training. Finally, we demonstrate that our predicted difficulty scores are useful for weakly supervised object localization (8% improvement) and semi-supervised object classification (1% improvement).
{"title":"Simultaneous Optical Flow and Intensity Estimation from an Event Camera","authors":"Patrick Bardow, A. Davison, Stefan Leutenegger","doi":"10.1109/CVPR.2016.102","DOIUrl":"https://doi.org/10.1109/CVPR.2016.102","url":null,"abstract":"Event cameras are bio-inspired vision sensors which mimic retinas to measure per-pixel intensity change rather than outputting an actual intensity image. This proposed paradigm shift away from traditional frame cameras offers significant potential advantages: namely avoiding high data rates, dynamic range limitations and motion blur. Unfortunately, however, established computer vision algorithms may not at all be applied directly to event cameras. Methods proposed so far to reconstruct images, estimate optical flow, track a camera and reconstruct a scene come with severe restrictions on the environment or on the motion of the camera, e.g. allowing only rotation. Here, we propose, to the best of our knowledge, the first algorithm to simultaneously recover the motion field and brightness image, while the camera undergoes a generic motion through any scene. Our approach employs minimisation of a cost function that contains the asynchronous event data as well as spatial and temporal regularisation within a sliding window time interval. Our implementation relies on GPU optimisation and runs in near real-time. In a series of examples, we demonstrate the successful operation of our framework, including in situations where conventional cameras suffer from dynamic range limitations and motion blur.","PeriodicalId":6515,"journal":{"name":"2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"43 1","pages":"884-892"},"PeriodicalIF":0.0,"publicationDate":"2016-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88286538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Multivariate Regression on the Grassmannian for Predicting Novel Domains","authors":"Yongxin Yang, Timothy M. Hospedales","doi":"10.1109/CVPR.2016.548","DOIUrl":"https://doi.org/10.1109/CVPR.2016.548","url":null,"abstract":"We study the problem of predicting how to recognise visual objects in novel domains with neither labelled nor unlabelled training data. Domain adaptation is now an established research area due to its value in ameliorating the issue of domain shift between train and test data. However, it is conventionally assumed that domains are discrete entities, and that at least unlabelled data is provided in testing domains. In this paper, we consider the case where domains are parametrised by a vector of continuous values (e.g., time, lighting or view angle). We aim to use such domain metadata to predict novel domains for recognition. This allows a recognition model to be pre-calibrated for a new domain in advance (e.g., future time or view angle) without waiting for data collection and re-training. We achieve this by posing the problem as one of multivariate regression on the Grassmannian, where we regress a domain's subspace (point on the Grassmannian) against an independent vector of domain parameters. We derive two novel methodologies to achieve this challenging task: a direct kernel regression from RM ! G, and an indirect method with better extrapolation properties. We evaluate our methods on two crossdomain visual recognition benchmarks, where they perform close to the upper bound of full data domain adaptation. This demonstrates that data is not necessary for domain adaptation if a domain can be parametrically described.","PeriodicalId":6515,"journal":{"name":"2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"6 1","pages":"5071-5080"},"PeriodicalIF":0.0,"publicationDate":"2016-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84776732","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sketch Me That Shoe
Qian Yu, Feng Liu, Yi-Zhe Song, T. Xiang, Timothy M. Hospedales, Chen Change Loy
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 799-807. DOI: 10.1109/CVPR.2016.93
Abstract: We investigate the problem of fine-grained sketch-based image retrieval (SBIR), where free-hand human sketches are used as queries to perform instance-level retrieval of images. This is an extremely challenging task because (i) visual comparisons not only need to be fine-grained but also executed cross-domain, (ii) free-hand (finger) sketches are highly abstract, making fine-grained matching harder, and most importantly (iii) annotated cross-domain sketch-photo datasets required for training are scarce, challenging many state-of-the-art machine learning techniques. In this paper, for the first time, we address all these challenges, providing a step towards the capabilities that would underpin a commercial sketch-based image retrieval application. We introduce a new database of 1,432 sketch-photo pairs from two categories with 32,000 fine-grained triplet ranking annotations. We then develop a deep triplet-ranking model for instance-level SBIR with a novel data augmentation and staged pre-training strategy to alleviate the issue of insufficient fine-grained training data. Extensive experiments are carried out to contribute a variety of insights into the challenges of data sufficiency and over-fitting avoidance when training deep networks for fine-grained cross-domain ranking tasks.
Discovering the Physical Parts of an Articulated Object Class from Multiple Videos
Luca Del Pero, Susanna Ricco, R. Sukthankar, V. Ferrari
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 714-723. DOI: 10.1109/CVPR.2016.84
Abstract: We propose a motion-based method to discover the physical parts of an articulated object class (e.g. head/torso/leg of a horse) from multiple videos. The key is to find object regions that exhibit consistent motion relative to the rest of the object, across multiple videos. We can then learn a location model for the parts and segment them accurately in the individual videos using an energy function that also enforces temporal and spatial consistency in part motion. Unlike our approach, traditional methods for motion segmentation or non-rigid structure from motion operate on one video at a time. Hence they cannot discover a part unless it displays independent motion in that particular video. We evaluate our method on a new dataset of 32 videos of tigers and horses, where we significantly outperform a recent motion segmentation method on the task of part discovery (obtaining roughly twice the accuracy).
Kinematic Structure Correspondences via Hypergraph Matching
H. Chang, Tobias Fischer, Maxime Petit, Martina Zambelli, Y. Demiris
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4216-4225. DOI: 10.1109/CVPR.2016.457
Abstract: In this paper, we present a novel framework for finding the kinematic structure correspondence between two objects in videos via hypergraph matching. In contrast to prior appearance- and graph-alignment-based matching methods, which have been applied between two similar static images, the proposed method finds correspondences between two dynamic kinematic structures of heterogeneous objects in videos. Our main contributions can be summarised as follows: (i) casting the kinematic structure correspondence problem into a hypergraph matching problem, incorporating multi-order similarities with normalising weights, (ii) a structural topology similarity measure based on a new topology-constrained subgraph isomorphism aggregation, (iii) a kinematic correlation measure between pairwise nodes, and (iv) a combinatorial local motion similarity measure using geodesic distance on the Riemannian manifold. We demonstrate the robustness and accuracy of our method through a number of experiments on complex articulated synthetic and real data.
{"title":"iLab-20M: A Large-Scale Controlled Object Dataset to Investigate Deep Learning","authors":"A. Borji, S. Izadi, L. Itti","doi":"10.1109/CVPR.2016.244","DOIUrl":"https://doi.org/10.1109/CVPR.2016.244","url":null,"abstract":"Tolerance to image variations (e.g., translation, scale, pose, illumination, background) is an important desired property of any object recognition system, be it human or machine. Moving towards increasingly bigger datasets has been trending in computer vision especially with the emergence of highly popular deep learning models. While being very useful for learning invariance to object inter-and intra-class shape variability, these large-scale wild datasets are not very useful for learning invariance to other parameters urging researchers to resort to other tricks for training models. In this work, we introduce a large-scale synthetic dataset, which is freely and publicly available, and use it to answer several fundamental questions regarding selectivity and invariance properties of convolutional neural networks. Our dataset contains two parts: a) objects shot on a turntable: 15 categories, 8 rotation angles, 11 cameras on a semi-circular arch, 5 lighting conditions, 3 focus levels, variety of backgrounds (23.4 per instance) generating 1320 images per instance (about 22 million images in total), and b) scenes: in which a robotic arm takes pictures of objects on a 1:160 scale scene. We study: 1) invariance and selectivity of different CNN layers, 2) knowledge transfer from one object category to another, 3) systematic or random sampling of images to build a train set, 4) domain adaptation from synthetic to natural scenes, and 5) order of knowledge delivery to CNNs. We also discuss how our analyses can lead the field to develop more efficient deep learning methods.","PeriodicalId":6515,"journal":{"name":"2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"24 1","pages":"2221-2230"},"PeriodicalIF":0.0,"publicationDate":"2016-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84731860","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
End-to-End Learning of Deformable Mixture of Parts and Deep Convolutional Neural Networks for Human Pose Estimation
Wei Yang, Wanli Ouyang, Hongsheng Li, Xiaogang Wang
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 3073-3082. DOI: 10.1109/CVPR.2016.335
Abstract: Recently, Deep Convolutional Neural Networks (DCNNs) have been applied to the task of human pose estimation, and have shown their potential for learning better feature representations and capturing contextual relationships. However, it is difficult to incorporate domain prior knowledge such as geometric relationships among body parts into DCNNs. In addition, training DCNN-based body part detectors without consideration of global body joint consistency introduces ambiguities, which increases the complexity of training. In this paper, we propose a novel end-to-end framework for human pose estimation that combines DCNNs with the expressive deformable mixture of parts. We explicitly incorporate domain prior knowledge into the framework, which greatly regularizes the learning process and gives our framework the flexibility to handle loopy or tree-structured models. The effectiveness of jointly learning a DCNN with a deformable mixture-of-parts model is evaluated through extensive experiments on several widely used benchmarks. The proposed approach significantly improves the performance compared with state-of-the-art approaches, especially on benchmarks with challenging articulations.
Visual Path Prediction in Complex Scenes with Crowded Moving Objects
Y. Yoo, Kimin Yun, Sangdoo Yun, Jonghee Hong, Hawook Jeong, J. Choi
2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2668-2677. DOI: 10.1109/CVPR.2016.292
Abstract: This paper proposes a novel path prediction algorithm that goes a step beyond existing work focused on single-target path prediction. We consider the moving dynamics of co-occurring objects for path prediction in scenes crowded with moving objects. To solve this problem, we first suggest a two-layered probabilistic model to find major movement patterns and their co-occurrence tendency. By utilizing the unsupervised learning results from the model, we present an algorithm to find the future location of any target object. Through extensive qualitative and quantitative experiments, we show that our algorithm can find a plausible future path in complex scenes with a large number of moving objects.
{"title":"HD Maps: Fine-Grained Road Segmentation by Parsing Ground and Aerial Images","authors":"G. Máttyus, Shenlong Wang, S. Fidler, R. Urtasun","doi":"10.1109/CVPR.2016.393","DOIUrl":"https://doi.org/10.1109/CVPR.2016.393","url":null,"abstract":"In this paper we present an approach to enhance existing maps with fine grained segmentation categories such as parking spots and sidewalk, as well as the number and location of road lanes. Towards this goal, we propose an efficient approach that is able to estimate these fine grained categories by doing joint inference over both, monocular aerial imagery, as well as ground images taken from a stereo camera pair mounted on top of a car. Important to this is reasoning about the alignment between the two types of imagery, as even when the measurements are taken with sophisticated GPS+IMU systems, this alignment is not sufficiently accurate. We demonstrate the effectiveness of our approach on a new dataset which enhances KITTI [8] with aerial images taken with a camera mounted on an airplane and flying around the city of Karlsruhe, Germany.","PeriodicalId":6515,"journal":{"name":"2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"21 1","pages":"3611-3619"},"PeriodicalIF":0.0,"publicationDate":"2016-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79772098","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}