{"title":"PointGrid: A Deep Network for 3D Shape Understanding","authors":"Truc Le, Y. Duan","doi":"10.1109/CVPR.2018.00959","DOIUrl":"https://doi.org/10.1109/CVPR.2018.00959","url":null,"abstract":"Volumetric grid is widely used for 3D deep learning due to its regularity. However the use of relatively lower order local approximation functions such as piece-wise constant function (occupancy grid) or piece-wise linear function (distance field) to approximate 3D shape means that it needs a very high-resolution grid to represent finer geometry details, which could be memory and computationally inefficient. In this work, we propose the PointGrid, a 3D convolutional network that incorporates a constant number of points within each grid cell thus allowing the network to learn higher order local approximation functions that could better represent the local geometry shape details. With experiments on popular shape recognition benchmarks, PointGrid demonstrates state-of-the-art performance over existing deep learning methods on both classification and segmentation.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":"96 1","pages":"9204-9214"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74373639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"DenseASPP for Semantic Segmentation in Street Scenes","authors":"Maoke Yang, Kun Yu, Chi Zhang, Zhiwei Li, Kuiyuan Yang","doi":"10.1109/CVPR.2018.00388","DOIUrl":"https://doi.org/10.1109/CVPR.2018.00388","url":null,"abstract":"Semantic image segmentation is a basic street scene understanding task in autonomous driving, where each pixel in a high resolution image is categorized into a set of semantic labels. Unlike other scenarios, objects in autonomous driving scene exhibit very large scale changes, which poses great challenges for high-level feature representation in a sense that multi-scale information must be correctly encoded. To remedy this problem, atrous convolution[14]was introduced to generate features with larger receptive fields without sacrificing spatial resolution. Built upon atrous convolution, Atrous Spatial Pyramid Pooling (ASPP)[2] was proposed to concatenate multiple atrous-convolved features using different dilation rates into a final feature representation. Although ASPP is able to generate multi-scale features, we argue the feature resolution in the scale-axis is not dense enough for the autonomous driving scenario. To this end, we propose Densely connected Atrous Spatial Pyramid Pooling (DenseASPP), which connects a set of atrous convolutional layers in a dense way, such that it generates multi-scale features that not only cover a larger scale range, but also cover that scale range densely, without significantly increasing the model size. We evaluate DenseASPP on the street scene benchmark Cityscapes[4] and achieve state-of-the-art performance.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":"12 1","pages":"3684-3692"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76684713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Mesoscopic Facial Geometry Inference Using Deep Neural Networks","authors":"Loc Huynh, Weikai Chen, Shunsuke Saito, Jun Xing, Koki Nagano, Andrew Jones, P. Debevec, Hao Li","doi":"10.1109/CVPR.2018.00877","DOIUrl":"https://doi.org/10.1109/CVPR.2018.00877","url":null,"abstract":"We present a learning-based approach for synthesizing facial geometry at medium and fine scales from diffusely-lit facial texture maps. When applied to an image sequence, the synthesized detail is temporally coherent. Unlike current state-of-the-art methods [17, 5], which assume \"dark is deep\", our model is trained with measured facial detail collected using polarized gradient illumination in a Light Stage [20]. This enables us to produce plausible facial detail across the entire face, including where previous approaches may incorrectly interpret dark features as concavities such as at moles, hair stubble, and occluded pores. Instead of directly inferring 3D geometry, we propose to encode fine details in high-resolution displacement maps which are learned through a hybrid network adopting the state-of-the-art image-to-image translation network [29] and super resolution network [43]. To effectively capture geometric detail at both mid- and high frequencies, we factorize the learning into two separate sub-networks, enabling the full range of facial detail to be modeled. Results from our learning-based approach compare favorably with a high-quality active facial scanhening technique, and require only a single passive lighting condition without a complex scanning setup.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":"52 1","pages":"8407-8416"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76900806","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Augmenting Crowd-Sourced 3D Reconstructions Using Semantic Detections","authors":"True Price, Johannes L. Schönberger, Zhen Wei, M. Pollefeys, Jan-Michael Frahm","doi":"10.1109/CVPR.2018.00206","DOIUrl":"https://doi.org/10.1109/CVPR.2018.00206","url":null,"abstract":"Image-based 3D reconstruction for Internet photo collections has become a robust technology to produce impressive virtual representations of real-world scenes. However, several fundamental challenges remain for Structure-from-Motion (SfM) pipelines, namely: the placement and reconstruction of transient objects only observed in single views, estimating the absolute scale of the scene, and (suprisingly often) recovering ground surfaces in the scene. We propose a method to jointly address these remaining open problems of SfM. In particular, we focus on detecting people in individual images and accurately placing them into an existing 3D model. As part of this placement, our method also estimates the absolute scale of the scene from object semantics, which in this case constitutes the height distribution of the population. Further, we obtain a smooth approximation of the ground surface and recover the gravity vector of the scene directly from the individual person detections. We demonstrate the results of our approach on a number of unordered Internet photo collections, and we quantitatively evaluate the obtained absolute scene scales.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":"96 1","pages":"1926-1935"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80971681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High Performance Visual Tracking with Siamese Region Proposal Network","authors":"Bo Li, Junjie Yan, Wei Wu, Zheng Zhu, Xiaolin Hu","doi":"10.1109/CVPR.2018.00935","DOIUrl":"https://doi.org/10.1109/CVPR.2018.00935","url":null,"abstract":"Visual object tracking has been a fundamental topic in recent years and many deep learning based trackers have achieved state-of-the-art performance on multiple benchmarks. However, most of these trackers can hardly get top performance with real-time speed. In this paper, we propose the Siamese region proposal network (Siamese-RPN) which is end-to-end trained off-line with large-scale image pairs. Specifically, it consists of Siamese subnetwork for feature extraction and region proposal subnetwork including the classification branch and regression branch. In the inference phase, the proposed framework is formulated as a local one-shot detection task. We can pre-compute the template branch of the Siamese subnetwork and formulate the correlation layers as trivial convolution layers to perform online tracking. Benefit from the proposal refinement, traditional multi-scale test and online fine-tuning can be discarded. The Siamese-RPN runs at 160 FPS while achieving leading performance in VOT2015, VOT2016 and VOT2017 real-time challenges.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":"1 1","pages":"8971-8980"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83111186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Robust Hough Transform Based 3D Reconstruction from Circular Light Fields","authors":"A. Vianello, J. Ackermann, M. Diebold, B. Jähne","doi":"10.1109/CVPR.2018.00765","DOIUrl":"https://doi.org/10.1109/CVPR.2018.00765","url":null,"abstract":"Light-field imaging is based on images taken on a regular grid. Thus, high-quality 3D reconstructions are obtainable by analyzing orientations in epipolar plane images (EPIs). Unfortunately, such data only allows to evaluate one side of the object. Moreover, a constant intensity along each orientation is mandatory for most of the approaches. This paper presents a novel method which allows to reconstruct depth information from data acquired with a circular camera motion, termed circular light fields. With this approach it is possible to determine the full 360° view of target objects. Additionally, circular light fields allow retrieving depth from datasets acquired with telecentric lenses, which is not possible with linear light fields. The proposed method finds trajectories of 3D points in the EPIs by means of a modified Hough transform. For this purpose, binary EPI-edge images are used, which not only allow to obtain reliable depth information, but also overcome the limitation of constant intensity along trajectories. Experimental results on synthetic and real datasets demonstrate the quality of the proposed algorithm.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":"105 1","pages":"7327-7335"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80943086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RayNet: Learning Volumetric 3D Reconstruction with Ray Potentials","authors":"Despoina Paschalidou, Ali O. Ulusoy, Carolin Schmitt, L. Gool, Andreas Geiger","doi":"10.1109/CVPR.2018.00410","DOIUrl":"https://doi.org/10.1109/CVPR.2018.00410","url":null,"abstract":"In this paper, we consider the problem of reconstructing a dense 3D model using images captured from different views. Recent methods based on convolutional neural networks (CNN) allow learning the entire task from data. However, they do not incorporate the physics of image formation such as perspective geometry and occlusion. Instead, classical approaches based on Markov Random Fields (MRF) with ray-potentials explicitly model these physical processes, but they cannot cope with large surface appearance variations across different viewpoints. In this paper, we propose RayNet, which combines the strengths of both frameworks. RayNet integrates a CNN that learns view-invariant feature representations with an MRF that explicitly encodes the physics of perspective projection and occlusion. We train RayNet end-to-end using empirical risk minimization. We thoroughly evaluate our approach on challenging real-world datasets and demonstrate its benefits over a piece-wise trained baseline, hand-crafted models as well as other learning-based approaches.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":"52 1","pages":"3897-3906"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89074136","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Divide and Conquer for Full-Resolution Light Field Deblurring","authors":"M. Mohan, A. Rajagopalan","doi":"10.1109/CVPR.2018.00672","DOIUrl":"https://doi.org/10.1109/CVPR.2018.00672","url":null,"abstract":"The increasing popularity of computational light field (LF) cameras has necessitated the need for tackling motion blur which is a ubiquitous phenomenon in hand-held photography. The state-of-the-art method for blind deblurring of LFs of general 3D scenes is limited to handling only downsampled LF, both in spatial and angular resolution. This is due to the computational overhead involved in processing data-hungry full-resolution 4D LF altogether. Moreover, the method warrants high-end GPUs for optimization and is ineffective for wide-angle settings and irregular camera motion. In this paper, we introduce a new blind motion deblurring strategy for LFs which alleviates these limitations significantly. Our model achieves this by isolating 4D LF motion blur across the 2D subaperture images, thus paving the way for independent deblurring of these subaperture images. Furthermore, our model accommodates common camera motion parameterization across the subaperture images. Consequently, blind deblurring of any single subaperture image elegantly paves the way for cost-effective non-blind deblurring of the other subaperture images. Our approach is CPU-efficient computationally and can effectively deblur full-resolution LFs.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":"30 1","pages":"6421-6429"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90308082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Monocular Relative Depth Perception with Web Stereo Data Supervision","authors":"Ke Xian, Chunhua Shen, ZHIGUO CAO, Hao Lu, Yang Xiao, Ruibo Li, Zhenbo Luo","doi":"10.1109/CVPR.2018.00040","DOIUrl":"https://doi.org/10.1109/CVPR.2018.00040","url":null,"abstract":"In this paper we study the problem of monocular relative depth perception in the wild. We introduce a simple yet effective method to automatically generate dense relative depth annotations from web stereo images, and propose a new dataset that consists of diverse images as well as corresponding dense relative depth maps. Further, an improved ranking loss is introduced to deal with imbalanced ordinal relations, enforcing the network to focus on a set of hard pairs. Experimental results demonstrate that our proposed approach not only achieves state-of-the-art accuracy of relative depth perception in the wild, but also benefits other dense per-pixel prediction tasks, e.g., metric depth estimation and semantic segmentation.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":"21 1","pages":"311-320"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84502200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A Revised Underwater Image Formation Model","authors":"D. Akkaynak, T. Treibitz","doi":"10.1109/CVPR.2018.00703","DOIUrl":"https://doi.org/10.1109/CVPR.2018.00703","url":null,"abstract":"The current underwater image formation model descends from atmospheric dehazing equations where attenuation is a weak function of wavelength. We recently showed that this model introduces significant errors and dependencies in the estimation of the direct transmission signal because underwater, light attenuates in a wavelength-dependent manner. Here, we show that the backscattered signal derived from the current model also suffers from dependencies that were previously unaccounted for. In doing so, we use oceanographic measurements to derive the physically valid space of backscatter, and further show that the wideband coefficients that govern backscatter are different than those that govern direct transmission, even though the current model treats them to be the same. We propose a revised equation for underwater image formation that takes these differences into account, and validate it through in situ experiments underwater. This revised model might explain frequent instabilities of current underwater color reconstruction models, and calls for the development of new methods.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":"11 1","pages":"6723-6732"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86274766","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}