2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR): Latest Publications

Learning the Multilinear Structure of Visual Data
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2017-07-21 DOI: 10.1109/CVPR.2017.641 Pages: 6053-6061
Mengjiao MJ Wang, Yannis Panagakis, Patrick Snape, S. Zafeiriou
Abstract: Statistical decomposition methods are of paramount importance in discovering the modes of variation of visual data. Probably the most prominent linear decomposition method is Principal Component Analysis (PCA), which discovers a single mode of variation in the data. However, in practice, visual data exhibit several modes of variation. For instance, the appearance of faces varies in identity, expression, pose, etc. To extract these modes of variation from visual data, several supervised methods, such as TensorFaces, that rely on multilinear (tensor) decomposition (e.g., Higher Order SVD) have been developed. The main drawback of such methods is that they require both labels for the modes of variation and the same number of samples under all modes of variation (e.g., the same face under different expressions, poses, etc.). Therefore, their applicability is limited to well-organised data, usually captured in well-controlled conditions. In this paper, we propose the first general multilinear method, to the best of our knowledge, that discovers the multilinear structure of visual data in an unsupervised setting, that is, without the presence of labels. We demonstrate the applicability of the proposed method in two applications, namely Shape from Shading (SfS) and expression transfer.
Citations: 15
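The abstract above contrasts PCA, which captures a single mode of variation, with multilinear (tensor) decompositions such as Higher Order SVD that separate several modes. The sketch below shows what a plain truncated HOSVD of a labelled, fully crossed data tensor looks like in numpy; it is the supervised baseline the paper generalizes, not the paper's unsupervised algorithm, and the tensor sizes and ranks are made up for illustration.

```python
# A minimal sketch (not the paper's unsupervised method): truncated Higher Order
# SVD of a synthetic "identity x expression x pixels" tensor, the kind of
# supervised multilinear decomposition (as in TensorFaces) referenced above.
import numpy as np

def unfold(tensor, mode):
    """Matricize a tensor along the given mode (mode-n unfolding)."""
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def hosvd(tensor, ranks):
    """Return factor matrices and core tensor of a truncated HOSVD."""
    factors = []
    for mode, r in enumerate(ranks):
        # Left singular vectors of each unfolding span that mode's subspace.
        u, _, _ = np.linalg.svd(unfold(tensor, mode), full_matrices=False)
        factors.append(u[:, :r])
    core = tensor
    for mode, u in enumerate(factors):
        # Mode-n product of the core with each factor transposed.
        core = np.moveaxis(np.tensordot(u.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return factors, core

# Synthetic data: 10 identities x 5 expressions x 64 pixels (illustrative sizes).
rng = np.random.default_rng(0)
data = rng.standard_normal((10, 5, 64))
factors, core = hosvd(data, ranks=(4, 3, 16))
print([f.shape for f in factors], core.shape)
```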
Fine-Grained Recognition as HSnet Search for Informative Image Parts
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2017-07-21 DOI: 10.1109/CVPR.2017.688 Pages: 6497-6506
Michael Lam, Behrooz Mahasseni, S. Todorovic
Abstract: This work addresses fine-grained image classification. Our work is based on the hypothesis that when dealing with subtle differences among object classes it is critical to identify and account for only a few informative image parts, as the remaining image context may not only be uninformative but may also hurt recognition. This motivates us to formulate our problem as a sequential search for informative parts over a deep feature map produced by a deep Convolutional Neural Network (CNN). A state of this search is a set of proposal bounding boxes in the image, whose informativeness is evaluated by the heuristic function (H) and used for generating new candidate states by the successor function (S). The two functions are unified via a Long Short-Term Memory network (LSTM) into a new deep recurrent architecture, called HSnet. Thus, HSnet (i) generates proposals of informative image parts and (ii) fuses all proposals toward final fine-grained recognition. We specify both supervised and weakly supervised training of HSnet depending on the availability of object part annotations. Evaluation on the benchmark Caltech-UCSD Birds 200-2011 and Cars-196 datasets demonstrates our competitive performance relative to the state of the art.
Citations: 105
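The abstract frames recognition as a sequential search whose states are sets of proposal boxes, scored by a heuristic function H and expanded by a successor function S. The toy sketch below only illustrates that search loop; the box size, step, and mean-activation score are hypothetical placeholders, and the real HSnet implements H and S jointly with an LSTM over CNN features.

```python
# A toy sketch of the H/S search loop described above, with plain functions
# standing in for the LSTM-based heuristic and successor; not HSnet itself.
import numpy as np

rng = np.random.default_rng(1)
feature_map = rng.random((32, 32, 256))  # stand-in H x W x C deep feature map

def heuristic_H(boxes, fmap):
    """Score a state (a set of boxes) by average activation inside the boxes."""
    scores = [fmap[y0:y1, x0:x1].mean() for (y0, x0, y1, x1) in boxes]
    return float(np.mean(scores))

def successor_S(boxes, fmap, step=4):
    """Generate candidate next states by adding one shifted 8x8 box."""
    candidates = []
    for y0 in range(0, fmap.shape[0] - 8, step):
        for x0 in range(0, fmap.shape[1] - 8, step):
            candidates.append(boxes + [(y0, x0, y0 + 8, x0 + 8)])
    return candidates

state = []                      # start with no selected parts
for _ in range(3):              # greedily select three informative parts
    candidates = successor_S(state, feature_map)
    state = max(candidates, key=lambda s: heuristic_H(s, feature_map))
print("selected part boxes:", state)
```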
LCR-Net: Localization-Classification-Regression for Human Pose
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2017-07-21 DOI: 10.1109/CVPR.2017.134 Pages: 1216-1224
Grégory Rogez, Philippe Weinzaepfel, C. Schmid
Abstract: We propose an end-to-end architecture for joint 2D and 3D human pose estimation in natural images. Key to our approach is the generation and scoring of a number of pose proposals per image, which allows us to predict 2D and 3D pose of multiple people simultaneously. Hence, our approach does not require an approximate localization of the humans for initialization. Our architecture, named LCR-Net, contains 3 main components: 1) the pose proposal generator that suggests potential poses at different locations in the image, 2) a classifier that scores the different pose proposals, and 3) a regressor that refines pose proposals both in 2D and 3D. All three stages share the convolutional feature layers and are trained jointly. The final pose estimation is obtained by integrating over neighboring pose hypotheses, which is shown to improve over a standard non-maximum suppression algorithm. Our approach significantly outperforms the state of the art in 3D pose estimation on Human3.6M, a controlled environment. Moreover, it shows promising results on real images for both single and multi-person subsets of the MPII 2D pose benchmark.
Citations: 280
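The abstract notes that the final pose is obtained by integrating over neighboring pose hypotheses rather than by non-maximum suppression. The sketch below shows one plausible reading of that aggregation step, a score-weighted average of proposals close to the top-scoring one; the shapes, distance metric, and threshold are assumptions, not values from the paper.

```python
# A minimal sketch of aggregating pose proposals by integration instead of NMS:
# proposals near the best-scoring one are averaged, weighted by classifier score.
import numpy as np

rng = np.random.default_rng(2)
num_proposals, num_joints = 50, 13
poses_2d = rng.random((num_proposals, num_joints, 2)) * 100   # proposal 2D poses (stand-ins)
scores = rng.random(num_proposals)                            # classifier scores (stand-ins)

def integrate_neighbors(poses, scores, radius=15.0):
    """Score-weighted average of proposals near the best-scoring proposal."""
    best = poses[np.argmax(scores)]
    # Mean per-joint distance of every proposal to the best one.
    dists = np.linalg.norm(poses - best, axis=2).mean(axis=1)
    neighbors = dists < radius
    w = scores[neighbors] / scores[neighbors].sum()
    return np.tensordot(w, poses[neighbors], axes=1)           # (num_joints, 2)

final_pose = integrate_neighbors(poses_2d, scores)
print(final_pose.shape)
```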
Real-Time 3D Model Tracking in Color and Depth on a Single CPU Core
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2017-07-21 DOI: 10.1109/CVPR.2017.57 Pages: 465-473
Wadim Kehl, Federico Tombari, Slobodan Ilic, Nassir Navab
Abstract: We present a novel method to track 3D models in color and depth data. To this end, we introduce approximations that accelerate the state of the art in region-based tracking by an order of magnitude while retaining similar accuracy. Furthermore, we show how the method can be made more robust in the presence of depth data and consequently formulate a new joint contour and ICP tracking energy. We present better results than the state of the art while being much faster than most other methods and achieving all of the above on a single CPU core.
Citations: 32
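The tracking energy combines a region-based contour term with ICP on the depth data. The sketch below isolates the ICP ingredient only: one rigid-alignment step via the Kabsch solution, assuming correspondences are already matched (full ICP re-estimates them between iterations); the contour term and the paper's specific joint energy are not reproduced here.

```python
# A minimal sketch of a single point-to-point ICP alignment step (Kabsch/Procrustes).
import numpy as np

def icp_step(src, dst):
    """Best rigid transform (R, t) aligning matched source points to target points."""
    src_c, dst_c = src.mean(axis=0), dst.mean(axis=0)
    H = (src - src_c).T @ (dst - dst_c)          # cross-covariance of centered points
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))       # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t

rng = np.random.default_rng(3)
model = rng.random((200, 3))                     # model points
R_true = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
observed = model @ R_true.T + np.array([0.1, -0.2, 0.3])   # simulated depth observation
R_est, t_est = icp_step(model, observed)
print(np.allclose(R_est, R_true), np.round(t_est, 3))
```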
Weakly Supervised Affordance Detection
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2017-07-21 DOI: 10.1109/CVPR.2017.552 Pages: 5197-5206
Johann Sawatzky, A. Srikantha, Juergen Gall
Abstract: Localizing functional regions of objects or affordances is an important aspect of scene understanding and relevant for many robotics applications. In this work, we introduce a pixel-wise annotated affordance dataset of 3090 images containing 9916 object instances. Since parts of an object can have multiple affordances, we address this by a convolutional neural network for multilabel affordance segmentation. We also propose an approach to train the network from very few keypoint annotations. Our approach achieves a higher affordance detection accuracy than other weakly supervised methods that also rely on keypoint annotations or image annotations as weak supervision.
Citations: 66
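Because a single object part can carry several affordances, the segmentation is multilabel: each affordance class gets its own per-pixel binary decision instead of competing in one softmax. The sketch below shows that loss formulation with random stand-in data; it is not the paper's network or training procedure.

```python
# A minimal sketch of a multilabel segmentation loss: independent per-pixel,
# per-affordance sigmoid with binary cross-entropy.
import numpy as np

def multilabel_bce(logits, targets, eps=1e-7):
    """Mean binary cross-entropy over pixels and affordance classes."""
    probs = 1.0 / (1.0 + np.exp(-logits))
    probs = np.clip(probs, eps, 1.0 - eps)
    return float(-np.mean(targets * np.log(probs) + (1 - targets) * np.log(1 - probs)))

rng = np.random.default_rng(4)
H, W, num_affordances = 64, 64, 7                              # illustrative sizes
logits = rng.standard_normal((H, W, num_affordances))          # stand-in network output
targets = (rng.random((H, W, num_affordances)) > 0.8).astype(float)  # multi-hot masks
print("loss:", multilabel_bce(logits, targets))
```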
Learning to Detect Salient Objects with Image-Level Supervision
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2017-07-21 DOI: 10.1109/CVPR.2017.404 Pages: 3796-3805
Lijun Wang, Huchuan Lu, Yifan Wang, Mengyang Feng, D. Wang, Baocai Yin, Xiang Ruan
Abstract: Deep Neural Networks (DNNs) have substantially improved the state-of-the-art in salient object detection. However, training DNNs requires costly pixel-level annotations. In this paper, we leverage the observation that image-level tags provide important cues of foreground salient objects, and develop a weakly supervised learning method for saliency detection using image-level tags only. The Foreground Inference Network (FIN) is introduced for this challenging task. In the first stage of our training method, FIN is jointly trained with a fully convolutional network (FCN) for image-level tag prediction. A global smooth pooling layer is proposed, enabling the FCN to assign object category tags to corresponding object regions, while FIN is capable of capturing all potential foreground regions with the predicted saliency maps. In the second stage, FIN is fine-tuned with its predicted saliency maps as ground truth. For refinement of the ground truth, an iterative Conditional Random Field is developed to enforce spatial label consistency and further boost performance. Our method alleviates annotation efforts and allows the usage of existing large-scale training sets with image-level tags. Our model runs at 60 FPS, outperforms unsupervised ones by a large margin, and achieves performance comparable or even superior to that of fully supervised counterparts.
Citations: 821
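The method needs to turn an FCN's per-class activation map into an image-level tag score while keeping the map spatially meaningful. The sketch below shows one generic smooth pooling that interpolates between average and max pooling (log-sum-exp); it is a stand-in illustration only, and the paper's proposed global smooth pooling layer is its own construction and may use a different formula.

```python
# A sketch of pooling a per-class activation map into an image-level score with
# log-sum-exp pooling (a smooth blend of average and max pooling); illustrative,
# not necessarily the paper's global smooth pooling layer.
import numpy as np

def log_sum_exp_pool(activation_map, r=5.0):
    """Image-level score from an H x W activation map; larger r is closer to max pooling."""
    a = activation_map.ravel()
    m = a.max()                                   # subtract the max for numerical stability
    return float(m + np.log(np.mean(np.exp(r * (a - m)))) / r)

rng = np.random.default_rng(5)
class_map = rng.standard_normal((28, 28))         # stand-in FCN output for one tag
print("avg:", class_map.mean(), "max:", class_map.max(), "lse:", log_sum_exp_pool(class_map))
```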
Snapshot Hyperspectral Light Field Imaging
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2017-07-21 DOI: 10.1109/CVPR.2017.727 Pages: 6873-6881
Zhiwei Xiong, Lizhi Wang, Huiqun Li, Dong Liu, Feng Wu
Abstract: This paper presents the first snapshot hyperspectral light field imager in practice. Specifically, we design a novel hybrid camera system to obtain two complementary measurements that sample the angular and spectral dimensions respectively. To recover the full 5D hyperspectral light field from the severely undersampled measurements, we then propose an efficient computational reconstruction algorithm by exploiting the large correlations across the angular and spectral dimensions through self-learned dictionaries. Simulation on an elaborate hyperspectral light field dataset validates the effectiveness of the proposed approach. Hardware experimental results demonstrate that, for the first time to our knowledge, a 5D hyperspectral light field containing 9x9 angular views and 27 spectral bands can be acquired in a single shot.
Citations: 36
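Recovering a 5D hyperspectral light field from severely undersampled measurements rests on sparse coding with learned dictionaries. The sketch below shows the generic ingredient, reconstructing a sparse code from compressed measurements y = Phi D a with orthogonal matching pursuit; the random dictionary, measurement operator, and sizes are illustrative assumptions, whereas the paper learns its dictionaries from the data and uses its own reconstruction algorithm.

```python
# A minimal sketch of dictionary-based sparse reconstruction from undersampled
# measurements via plain orthogonal matching pursuit (OMP).
import numpy as np

def omp(A, y, sparsity):
    """Greedy OMP: solve y ~= A x with at most `sparsity` nonzero entries in x."""
    residual, support = y.copy(), []
    x = np.zeros(A.shape[1])
    for _ in range(sparsity):
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coeffs
    x[support] = coeffs
    return x

rng = np.random.default_rng(6)
n, m, k = 256, 100, 5                            # signal size, measurements, sparsity
D = rng.standard_normal((n, 400))                # stand-in (not learned) dictionary
D /= np.linalg.norm(D, axis=0)
a_true = np.zeros(400)
a_true[rng.choice(400, k, replace=False)] = rng.standard_normal(k)
Phi = rng.standard_normal((m, n)) / np.sqrt(m)   # undersampling measurement operator
y = Phi @ (D @ a_true)
a_hat = omp(Phi @ D, y, sparsity=k)
print("reconstruction error:", np.linalg.norm(D @ a_hat - D @ a_true))
```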
The Misty Three Point Algorithm for Relative Pose
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2017-07-21 DOI: 10.1109/CVPR.2017.484 Pages: 4551-4559
Tobias Palmér, Kalle Åström, Jan-Michael Frahm
Abstract: There is a significant interest in scene reconstruction from underwater images given its utility for oceanic research and for recreational image manipulation. In this paper we propose a novel algorithm for two-view camera motion estimation for underwater imagery. Our method leverages the constraints provided by the attenuation properties of water and their effect on color appearance to determine the depth difference of a point with respect to the two observing views of the underwater cameras. Additionally, we propose an algorithm, leveraging the depth differences of three such observed points, to estimate the relative pose of the cameras. Given the unknown underwater attenuation coefficients, our method estimates the relative motion up to scale. The results are represented as a generalized camera. We evaluate our method on both real data and simulated data.
Citations: 7
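The depth cue comes from how water attenuates color with viewing distance. Under a simple Beer-Lambert model I = J * exp(-beta * d) with backscatter ignored, the log-ratio of a point's observed color in two views isolates beta times the distance difference, so with beta unknown the motion is recoverable only up to scale, consistent with the abstract. The sketch below illustrates that relation with made-up coefficients; it is not the paper's exact imaging model or three-point solver.

```python
# A minimal sketch of the attenuation cue under a simple Beer-Lambert water model.
import numpy as np

beta = np.array([0.40, 0.15, 0.06])       # per-channel attenuation (unknown in practice)
J = np.array([0.8, 0.6, 0.5])             # true (unattenuated) color of the point
d1, d2 = 3.0, 4.5                          # viewing distances from cameras 1 and 2

I1 = J * np.exp(-beta * d1)                # observed colors in the two views
I2 = J * np.exp(-beta * d2)

# Per-channel estimate of beta * (d2 - d1); the true color J cancels in the ratio.
scaled_depth_diff = np.log(I1) - np.log(I2)
print(scaled_depth_diff, "vs", beta * (d2 - d1))   # equal: depth difference known only up to beta
```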
L2-Net: Deep Learning of Discriminative Patch Descriptor in Euclidean Space
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2017-07-21 DOI: 10.1109/CVPR.2017.649 Pages: 6128-6136
Yurun Tian, Bin Fan, Fuchao Wu
Abstract: The research focus of designing local patch descriptors has gradually shifted from handcrafted ones (e.g., SIFT) to learned ones. In this paper, we propose to learn a high-performance descriptor in Euclidean space via a Convolutional Neural Network (CNN). Our method is distinctive in four aspects: (i) We propose a progressive sampling strategy which enables the network to access billions of training samples in a few epochs. (ii) Derived from the basic concept of the local patch matching problem, we emphasize the relative distance between descriptors. (iii) Extra supervision is imposed on the intermediate feature maps. (iv) Compactness of the descriptor is taken into account. The proposed network is named L2-Net since the output descriptor can be matched in Euclidean space by L2 distance. L2-Net achieves state-of-the-art performance on the Brown datasets [16], Oxford dataset [18] and the newly proposed Hpatches dataset [11]. The good generalization ability shown by experiments indicates that L2-Net can serve as a direct substitution for existing handcrafted descriptors. The pre-trained L2-Net is publicly available.
Citations: 425
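Since the learned descriptors live in Euclidean space, matching reduces to nearest neighbors under L2 distance. The sketch below shows that matching step with random unit-norm 128-D vectors standing in for L2-Net outputs, plus a mutual nearest-neighbor check; only the distance computation is the point here.

```python
# A minimal sketch of matching descriptors by plain L2 distance with a mutual
# nearest-neighbor check; the descriptors are random stand-ins, not L2-Net outputs.
import numpy as np

rng = np.random.default_rng(7)
desc_a = rng.standard_normal((500, 128))
desc_b = rng.standard_normal((600, 128))
desc_a /= np.linalg.norm(desc_a, axis=1, keepdims=True)
desc_b /= np.linalg.norm(desc_b, axis=1, keepdims=True)

# Pairwise squared L2 distances via ||a||^2 + ||b||^2 - 2 a.b (all norms are 1 here).
d2 = 2.0 - 2.0 * desc_a @ desc_b.T
nn_ab = d2.argmin(axis=1)                  # best match in B for each descriptor in A
nn_ba = d2.argmin(axis=0)                  # best match in A for each descriptor in B
mutual = [(i, j) for i, j in enumerate(nn_ab) if nn_ba[j] == i]
print(f"{len(mutual)} mutual nearest-neighbor matches")
```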
Learned Contextual Feature Reweighting for Image Geo-Localization
2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Pub Date: 2017-07-21 DOI: 10.1109/CVPR.2017.346 Pages: 3251-3260
Hyo Jin Kim, Enrique Dunn, Jan-Michael Frahm
Abstract: We address the problem of large-scale image geo-localization where the location of an image is estimated by identifying geo-tagged reference images depicting the same place. We propose a novel model for learning image representations that integrates context-aware feature reweighting in order to effectively focus on regions that positively contribute to geo-localization. In particular, we introduce a Contextual Reweighting Network (CRN) that predicts the importance of each region in the feature map based on the image context. Our model is learned end-to-end for the image geo-localization task, and requires no annotation other than image geo-tags for training. In experimental results, the proposed approach significantly outperforms the previous state-of-the-art on the standard geo-localization benchmark datasets. We also demonstrate that our CRN discovers task-relevant contexts without any additional supervision.
Citations: 167
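The core idea is to predict a per-region importance weight from image context and use it to reweight the feature map before global aggregation. The sketch below mimics that with a toy weighting (softmax over local feature energy); the real CRN learns its weights end-to-end from geo-tags, so everything here apart from the reweight-then-aggregate structure is an assumption.

```python
# A minimal sketch of reweighting a convolutional feature map by per-location
# importance before pooling it into a global image descriptor; the weighting
# below is a toy stand-in for the learned Contextual Reweighting Network.
import numpy as np

rng = np.random.default_rng(8)
fmap = rng.standard_normal((14, 14, 512))            # stand-in H x W x C convolutional features

def softmax2d(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# Toy "context" score per location: local feature energy.
context_score = np.linalg.norm(fmap, axis=2)          # (H, W)
weights = softmax2d(context_score)                    # importance of each region

# Weighted aggregation: regions deemed informative dominate the global descriptor.
global_desc = np.tensordot(weights, fmap, axes=([0, 1], [0, 1]))   # (512,)
global_desc /= np.linalg.norm(global_desc)
print(global_desc.shape)
```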