{"title":"R²GAN: Cross-Modal Recipe Retrieval With Generative Adversarial Network","authors":"B. Zhu, C. Ngo, Jingjing Chen, Y. Hao","doi":"10.1109/CVPR.2019.01174","DOIUrl":"https://doi.org/10.1109/CVPR.2019.01174","url":null,"abstract":"Representing procedure text such as recipe for crossmodal retrieval is inherently a difficult problem, not mentioning to generate image from recipe for visualization. This paper studies a new version of GAN, named Recipe Retrieval Generative Adversarial Network (R2GAN), to explore the feasibility of generating image from procedure text for retrieval problem. The motivation of using GAN is twofold: learning compatible cross-modal features in an adversarial way, and explanation of search results by showing the images generated from recipes. The novelty of R2GAN comes from architecture design, specifically a GAN with one generator and dual discriminators is used, which makes the generation of image from recipe a feasible idea. Furthermore, empowered by the generated images, a two-level ranking loss in both embedding and image spaces are considered. These add-ons not only result in excellent retrieval performance, but also generate close-to-realistic food images useful for explaining ranking of recipes. On recipe1M dataset, R2GAN demonstrates high scalability to data size, outperforms all the existing approaches, and generates images intuitive for human to interpret the search results.","PeriodicalId":6711,"journal":{"name":"2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"50 1","pages":"11469-11478"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82531929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Generalizing Eye Tracking With Bayesian Adversarial Learning","authors":"Kang Wang, Rui Zhao, Hui Su, Q. Ji","doi":"10.1109/CVPR.2019.01218","DOIUrl":"https://doi.org/10.1109/CVPR.2019.01218","url":null,"abstract":"Existing appearance-based gaze estimation approaches with CNN have poor generalization performance. By systematically studying this issue, we identify three major factors: 1) appearance variations; 2) head pose variations and 3) over-fitting issue with point estimation. To improve the generalization performance, we propose to incorporate adversarial learning and Bayesian inference into a unified framework. In particular, we first add an adversarial component into traditional CNN-based gaze estimator so that we can learn features that are gaze-responsive but can generalize to appearance and pose variations. Next, we extend the point-estimation based deterministic model to a Bayesian framework so that gaze estimation can be performed using all parameters instead of only one set of parameters. Besides improved performance on several benchmark datasets, the proposed method also enables online adaptation of the model to new subjects/environments, demonstrating the potential usage for practical real-time eye tracking applications.","PeriodicalId":6711,"journal":{"name":"2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"51 1","pages":"11899-11908"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86328002","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning to Film From Professional Human Motion Videos","authors":"Chong Huang, Chuan-en Lin, Zhenyu Yang, Yan Kong, Peng Chen, Xin Yang, K. Cheng","doi":"10.1109/CVPR.2019.00437","DOIUrl":"https://doi.org/10.1109/CVPR.2019.00437","url":null,"abstract":"We investigate the problem of 6 degrees of freedom (DOF) camera planning for filming professional human motion videos using a camera drone. Existing methods either plan motions for only a pan-tilt-zoom (PTZ) camera, or adopt ad-hoc solutions without carefully considering the impact of video contents and previous camera motions on the future camera motions. As a result, they can hardly achieve satisfactory results in our drone cinematography task. In this study, we propose a learning-based framework which incorporates the video contents and previous camera motions to predict the future camera motions that enable the capture of professional videos. Specifically, the inputs of our framework are video contents which are represented using subject-related feature based on 2D skeleton and scene-related features extracted from background RGB images, and camera motions which are represented using optical flows. The correlation between the inputs and output future camera motions are learned via a sequence-to-sequence convolutional long short-term memory (Seq2Seq ConvLSTM) network from a large set of video clips. We deploy our approach to a real drone cinematography system by first predicting the future camera motions, and then converting them to the drone's control commands via an odometer. Our experimental results on extensive datasets and showcases exhibit significant improvements in our approach over conventional baselines and our approach can successfully mimic the footage of a professional cameraman.","PeriodicalId":6711,"journal":{"name":"2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"22 1","pages":"4239-4248"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83276649","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Coloring With Limited Data: Few-Shot Colorization via Memory Augmented Networks","authors":"Seungjoo Yoo, Hyojin Bahng, Sunghyo Chung, Junsoo Lee, Jaehyuk Chang, J. Choo","doi":"10.1109/CVPR.2019.01154","DOIUrl":"https://doi.org/10.1109/CVPR.2019.01154","url":null,"abstract":"Despite recent advancements in deep learning-based automatic colorization, they are still limited when it comes to few-shot learning. Existing models require a significant amount of training data. To tackle this issue, we present a novel memory-augmented colorization model MemoPainter that can produce high-quality colorization with limited data. In particular, our model is able to capture rare instances and successfully colorize them. Also, we propose a novel threshold triplet loss that enables unsupervised training of memory networks without the need for class labels. Experiments show that our model has superior quality in both few-shot and one-shot colorization tasks.","PeriodicalId":6711,"journal":{"name":"2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"4 1","pages":"11275-11284"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"87966473","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Double Nuclear Norm Based Low Rank Representation on Grassmann Manifolds for Clustering","authors":"Xinglin Piao, Yongli Hu, Junbin Gao, Yanfeng Sun, Baocai Yin","doi":"10.1109/CVPR.2019.01235","DOIUrl":"https://doi.org/10.1109/CVPR.2019.01235","url":null,"abstract":"Unsupervised clustering for high-dimension data (such as imageset or video) is a hard issue in data processing and data mining area since these data always lie on a manifold (such as Grassmann manifold). Inspired of Low Rank representation theory, researchers proposed a series of effective clustering methods for high-dimension data with non-linear metric. However, most of these methods adopt the traditional single nuclear norm as the relaxation of the rank function, which would lead to suboptimal solution deviated from the original one. In this paper, we propose a new low rank model for high-dimension data clustering task on Grassmann manifold based on the Double Nuclear norm which is used to better approximate the rank minimization of matrix. Further, to consider the inner geometry or structure of data space, we integrated the adaptive Laplacian regularization to construct the local relationship of data samples. The proposed models have been assessed on several public datasets for imageset clustering. The experimental results show that the proposed models outperform the state-of-the-art clustering ones.","PeriodicalId":6711,"journal":{"name":"2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"1 12","pages":"12067-12076"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91480921","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Binary Code for Personalized Fashion Recommendation","authors":"Zhi Lu, Yang Hu, Yu Jiang, Yan Chen, B. Zeng","doi":"10.1109/CVPR.2019.01081","DOIUrl":"https://doi.org/10.1109/CVPR.2019.01081","url":null,"abstract":"With the rapid growth of fashion-focused social networks and online shopping, intelligent fashion recommendation is now in great needs. Recommending fashion outfits, each of which is composed of multiple interacted clothing and accessories, is relatively new to the field. The problem becomes even more interesting and challenging when considering users' personalized fashion style. Another challenge in a large-scale fashion outfit recommendation system is the efficiency issue of item/outfit search and storage. In this paper, we propose to learn binary code for efficient personalized fashion outfits recommendation. Our system consists of three components, a feature network for content extraction, a set of type-dependent hashing modules to learn binary codes, and a matching block that conducts pairwise matching. The whole framework is trained in an end-to-end manner. We collect outfit data together with user label information from a fashion-focused social website for the personalized recommendation task. Extensive experiments on our datasets show that the proposed framework outperforms the state-of-the-art methods significantly even with a simple backbone.","PeriodicalId":6711,"journal":{"name":"2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"115 1","pages":"10554-10562"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83142050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"It's Not About the Journey; It's About the Destination: Following Soft Paths Under Question-Guidance for Visual Reasoning","authors":"Monica Haurilet, Alina Roitberg, R. Stiefelhagen","doi":"10.1109/CVPR.2019.00203","DOIUrl":"https://doi.org/10.1109/CVPR.2019.00203","url":null,"abstract":"Visual Reasoning remains a challenging task, as it has to deal with long-range and multi-step object relationships in the scene. We present a new model for Visual Reasoning, aimed at capturing the interplay among individual objects in the image represented as a scene graph. As not all graph components are relevant for the query, we introduce the concept of a question-based visual guide, which constrains the potential solution space by learning an optimal traversal scheme, where the final destination nodes alone are used to produce the answer. We show, that finding relevant semantic structures facilitates generalization to new tasks by introducing a novel problem of knowledge transfer: training on one question type and answering questions from a different domain without any training data. Furthermore, we report state-of-the-art results for Visual Reasoning on multiple query types and diverse image and video datasets.","PeriodicalId":6711,"journal":{"name":"2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"30 1","pages":"1930-1939"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82375526","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Deep Supervised Cross-Modal Retrieval","authors":"Liangli Zhen, Peng Hu, Xu Wang, Dezhong Peng","doi":"10.1109/CVPR.2019.01064","DOIUrl":"https://doi.org/10.1109/CVPR.2019.01064","url":null,"abstract":"Cross-modal retrieval aims to enable flexible retrieval across different modalities. The core of cross-modal retrieval is how to measure the content similarity between different types of data. In this paper, we present a novel cross-modal retrieval method, called Deep Supervised Cross-modal Retrieval (DSCMR). It aims to find a common representation space, in which the samples from different modalities can be compared directly. Specifically, DSCMR minimises the discrimination loss in both the label space and the common representation space to supervise the model learning discriminative features. Furthermore, it simultaneously minimises the modality invariance loss and uses a weight sharing strategy to eliminate the cross-modal discrepancy of multimedia data in the common representation space to learn modality-invariant features. Comprehensive experimental results on four widely-used benchmark datasets demonstrate that the proposed method is effective in cross-modal learning and significantly outperforms the state-of-the-art cross-modal retrieval methods.","PeriodicalId":6711,"journal":{"name":"2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"22 1","pages":"10386-10395"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81742176","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Searching for a Robust Neural Architecture in Four GPU Hours","authors":"Xuanyi Dong, Yezhou Yang","doi":"10.1109/CVPR.2019.00186","DOIUrl":"https://doi.org/10.1109/CVPR.2019.00186","url":null,"abstract":"Conventional neural architecture search (NAS) approaches are usually based on reinforcement learning or evolutionary strategy, which take more than 1000 GPU hours to find a good model on CIFAR-10. We propose an efficient NAS approach, which learns the searching approach by gradient descent. Our approach represents the search space as a directed acyclic graph (DAG). This DAG contains thousands of sub-graphs, each of which indicates a kind of neural architecture. To avoid traversing all the possibilities of the sub-graphs, we develop a differentiable sampler over the DAG. This sampler is learnable and optimized by the validation loss after training the sampled architecture. In this way, our approach can be trained in an end-to-end fashion by gradient descent, named Gradient-based search using Differentiable Architecture Sampler (GDAS). In experiments, we can finish one searching procedure in four GPU hours on CIFAR-10, and the discovered model obtains a test error of 2.82% with only 2.5M parameters, which is on par with the state-of-the-art.","PeriodicalId":6711,"journal":{"name":"2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"26 1","pages":"1761-1770"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84598220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Learning Joint Gait Representation via Quintuplet Loss Minimization","authors":"Kaihao Zhang, Wenhan Luo, Lin Ma, Wei Liu, Hongdong Li","doi":"10.1109/CVPR.2019.00483","DOIUrl":"https://doi.org/10.1109/CVPR.2019.00483","url":null,"abstract":"Gait recognition is an important biometric method popularly used in video surveillance, where the task is to identify people at a distance by their walking patterns from video sequences. Most of the current successful approaches for gait recognition either use a pair of gait images to form a cross-gait representation or rely on a single gait image for unique-gait representation. These two types of representations emperically complement one another. In this paper, we propose a new Joint Unique-gait and Cross-gait Network (JUCNet), to combine the advantages of unique-gait representation with that of cross-gait representation, leading to an significantly improved performance. Another key contribution of this paper is a novel quintuplet loss function, which simultaneously increases the inter-class differences by pushing representations extracted from different subjects apart and decreases the intra-class variations by pulling representations extracted from the same subject together. Experiments show that our method achieves the state-of-the-art performance tested on standard benchmark datasets, demonstrating its superiority over existing methods.","PeriodicalId":6711,"journal":{"name":"2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"220 1","pages":"4695-4704"},"PeriodicalIF":0.0,"publicationDate":"2019-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79829695","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}