2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition: Latest Publications

A Fast Resection-Intersection Method for the Known Rotation Problem
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Pub Date: 2018-06-01. DOI: 10.1109/CVPR.2018.00318
Qianggong Zhang, Tat-Jun Chin, Huu Le
{"title":"A Fast Resection-Intersection Method for the Known Rotation Problem","authors":"Qianggong Zhang, Tat-Jun Chin, Huu Le","doi":"10.1109/CVPR.2018.00318","DOIUrl":"https://doi.org/10.1109/CVPR.2018.00318","url":null,"abstract":"The known rotation problem refers to a special case of structure-from-motion where the absolute orientations of the cameras are known. When formulated as a minimax ($$) problem on reprojection errors, the problem is an instance of pseudo-convex programming. Though theoretically tractable, solving the known rotation problem on large-scale data (1,000's of views, 10,000's scene points) using existing methods can be very time-consuming. In this paper, we devise a fast algorithm for the known rotation problem. Our approach alternates between pose estimation and triangulation (i.e., resection-intersection) to break the problem into multiple simpler instances of pseudo-convex programming. The key to the vastly superior performance of our method lies in using a novel minimum enclosing ball (MEB) technique for the calculation of updating steps, which obviates the need for convex optimisation routines and greatly reduces memory footprint. We demonstrate the practicality of our method on large-scale problem instances which easily overwhelm current state-of-the-art algorithms.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84141661","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
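The speed of the method above hinges on a minimum enclosing ball (MEB) subroutine for computing update steps. The following sketch shows a generic MEB approximation in the Badoiu-Clarkson style, purely to illustrate that primitive; it is not the authors' exact construction, and the function name and iteration count are made up for illustration.

import numpy as np

def minimum_enclosing_ball(points, n_iter=200):
    """Approximate the minimum enclosing ball of a point set.

    Badoiu-Clarkson iteration: repeatedly pull the current centre towards
    the farthest point with a shrinking step. Returns (centre, radius).
    """
    c = points.mean(axis=0)                     # initial centre guess
    r = 0.0
    for k in range(1, n_iter + 1):
        d = np.linalg.norm(points - c, axis=1)  # distances to current centre
        far = points[np.argmax(d)]              # farthest point
        c = c + (far - c) / (k + 1)             # move centre towards it
        r = np.linalg.norm(points - c, axis=1).max()
    return c, r

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pts = rng.normal(size=(500, 3))
    centre, radius = minimum_enclosing_ball(pts)
    print("centre:", centre, "radius:", radius)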
Multi-level Fusion Based 3D Object Detection from Monocular Images
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Pub Date: 2018-06-01. DOI: 10.1109/CVPR.2018.00249
Bin Xu, Zhenzhong Chen
{"title":"Multi-level Fusion Based 3D Object Detection from Monocular Images","authors":"Bin Xu, Zhenzhong Chen","doi":"10.1109/CVPR.2018.00249","DOIUrl":"https://doi.org/10.1109/CVPR.2018.00249","url":null,"abstract":"In this paper, we present an end-to-end multi-level fusion based framework for 3D object detection from a single monocular image. The whole network is composed of two parts: one for 2D region proposal generation and another for simultaneously predictions of objects' 2D locations, orientations, dimensions, and 3D locations. With the help of a stand-alone module to estimate the disparity and compute the 3D point cloud, we introduce the multi-level fusion scheme. First, we encode the disparity information with a front view feature representation and fuse it with the RGB image to enhance the input. Second, features extracted from the original input and the point cloud are combined to boost the object detection. For 3D localization, we introduce an extra stream to predict the location information from point cloud directly and add it to the aforementioned location prediction. The proposed algorithm can directly output both 2D and 3D object detection results in an end-to-end fashion with only a single RGB image as the input. The experimental results on the challenging KITTI benchmark demonstrate that our algorithm significantly outperforms monocular state-of-the-art methods.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84166953","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 260
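To make the input-level fusion idea concrete, here is a minimal sketch of that step: normalising a disparity map, stacking it onto the RGB image as an extra channel, and back-projecting disparity to a pseudo point cloud with a pinhole model. It is a generic illustration under assumed camera intrinsics (fx, fy, cx, cy, baseline), not the authors' network.

import numpy as np

def fuse_rgb_disparity(rgb, disparity):
    """Input-level fusion: stack an encoded disparity map onto the RGB image.

    rgb:       (H, W, 3) float array in [0, 1]
    disparity: (H, W)    float array (arbitrary units)
    Returns an (H, W, 4) tensor usable as an enhanced network input.
    """
    span = disparity.max() - disparity.min() + 1e-8
    d = (disparity - disparity.min()) / span           # normalise to [0, 1]
    return np.concatenate([rgb, d[..., None]], axis=-1)

def disparity_to_points(disparity, fx, fy, cx, cy, baseline):
    """Back-project a disparity map to a pseudo point cloud (pinhole model)."""
    h, w = disparity.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = fx * baseline / np.maximum(disparity, 1e-6)    # depth from disparity
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)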
Neural Style Transfer via Meta Networks
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Pub Date: 2018-06-01. DOI: 10.1109/CVPR.2018.00841
Falong Shen, Shuicheng Yan, Gang Zeng
{"title":"Neural Style Transfer via Meta Networks","authors":"Falong Shen, Shuicheng Yan, Gang Zeng","doi":"10.1109/CVPR.2018.00841","DOIUrl":"https://doi.org/10.1109/CVPR.2018.00841","url":null,"abstract":"In this paper we propose a noval method to generate the specified network parameters through one feed-forward propagation in the meta networks for neural style transfer. Recent works on style transfer typically need to train image transformation networks for every new style, and the style is encoded in the network parameters by enormous iterations of stochastic gradient descent, which lacks the generalization ability to new style in the inference stage. To tackle these issues, we build a meta network which takes in the style image and generates a corresponding image transformation network directly. Compared with optimization-based methods for every style, our meta networks can handle an arbitrary new style within 19 milliseconds on one modern GPU card. The fast image transformation network generated by our meta network is only 449 KB, which is capable of real-time running on a mobile device. We also investigate the manifold of the style transfer networks by operating the hidden features from meta networks. Experiments have well validated the effectiveness of our method. Code and trained models will be released.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82966829","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 105
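A toy hyper-network illustrates the core mechanism: a small meta network maps a style code to the weights of a convolution that is then applied to the content image. This is a minimal PyTorch sketch of the weight-prediction idea, not the paper's architecture; TinyMetaNet and all layer sizes are invented for illustration.

import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMetaNet(nn.Module):
    """Toy hyper-network: maps style features to the weights of one conv layer."""
    def __init__(self, style_dim=64, out_ch=8, in_ch=3, k=3):
        super().__init__()
        self.out_ch, self.in_ch, self.k = out_ch, in_ch, k
        self.fc = nn.Linear(style_dim, out_ch * in_ch * k * k)

    def forward(self, style_code):
        w = self.fc(style_code)                        # predict conv weights
        return w.view(self.out_ch, self.in_ch, self.k, self.k)

if __name__ == "__main__":
    meta = TinyMetaNet()
    style_code = torch.randn(64)           # stand-in for pooled style features
    content = torch.randn(1, 3, 32, 32)    # stand-in content image
    conv_w = meta(style_code)              # one forward pass -> layer weights
    stylised = F.conv2d(content, conv_w, padding=1)
    print(stylised.shape)                  # torch.Size([1, 8, 32, 32])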
Robust Hough Transform Based 3D Reconstruction from Circular Light Fields
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Pub Date: 2018-06-01. DOI: 10.1109/CVPR.2018.00765
A. Vianello, J. Ackermann, M. Diebold, B. Jähne
{"title":"Robust Hough Transform Based 3D Reconstruction from Circular Light Fields","authors":"A. Vianello, J. Ackermann, M. Diebold, B. Jähne","doi":"10.1109/CVPR.2018.00765","DOIUrl":"https://doi.org/10.1109/CVPR.2018.00765","url":null,"abstract":"Light-field imaging is based on images taken on a regular grid. Thus, high-quality 3D reconstructions are obtainable by analyzing orientations in epipolar plane images (EPIs). Unfortunately, such data only allows to evaluate one side of the object. Moreover, a constant intensity along each orientation is mandatory for most of the approaches. This paper presents a novel method which allows to reconstruct depth information from data acquired with a circular camera motion, termed circular light fields. With this approach it is possible to determine the full 360° view of target objects. Additionally, circular light fields allow retrieving depth from datasets acquired with telecentric lenses, which is not possible with linear light fields. The proposed method finds trajectories of 3D points in the EPIs by means of a modified Hough transform. For this purpose, binary EPI-edge images are used, which not only allow to obtain reliable depth information, but also overcome the limitation of constant intensity along trajectories. Experimental results on synthetic and real datasets demonstrate the quality of the proposed algorithm.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80943086","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 9
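For orientation, the sketch below shows classic straight-line Hough voting over a binary EPI edge image, the linear-light-field analogue of the trajectory voting described above; the paper's circular variant votes for curved trajectories instead of straight lines. Accumulator sizes are illustrative.

import numpy as np

def hough_lines(edge_img, n_theta=180, n_rho=200):
    """Classic straight-line Hough voting on a binary edge image.

    For a linear-light-field EPI, the dominant line orientation through a
    pixel encodes disparity (and hence depth). Returns the accumulator and
    the (theta, rho) axes so peaks can be read off.
    """
    h, w = edge_img.shape
    thetas = np.linspace(0.0, np.pi, n_theta, endpoint=False)
    diag = np.hypot(h, w)
    rhos = np.linspace(-diag, diag, n_rho)
    acc = np.zeros((n_rho, n_theta), dtype=np.int64)
    ys, xs = np.nonzero(edge_img)
    for x, y in zip(xs, ys):
        rho = x * np.cos(thetas) + y * np.sin(thetas)  # rho for every theta
        idx = np.digitize(rho, rhos) - 1
        acc[idx, np.arange(n_theta)] += 1              # one vote per theta
    return acc, thetas, rhos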
Augmenting Crowd-Sourced 3D Reconstructions Using Semantic Detections
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Pub Date: 2018-06-01. DOI: 10.1109/CVPR.2018.00206
True Price, Johannes L. Schönberger, Zhen Wei, M. Pollefeys, Jan-Michael Frahm
{"title":"Augmenting Crowd-Sourced 3D Reconstructions Using Semantic Detections","authors":"True Price, Johannes L. Schönberger, Zhen Wei, M. Pollefeys, Jan-Michael Frahm","doi":"10.1109/CVPR.2018.00206","DOIUrl":"https://doi.org/10.1109/CVPR.2018.00206","url":null,"abstract":"Image-based 3D reconstruction for Internet photo collections has become a robust technology to produce impressive virtual representations of real-world scenes. However, several fundamental challenges remain for Structure-from-Motion (SfM) pipelines, namely: the placement and reconstruction of transient objects only observed in single views, estimating the absolute scale of the scene, and (suprisingly often) recovering ground surfaces in the scene. We propose a method to jointly address these remaining open problems of SfM. In particular, we focus on detecting people in individual images and accurately placing them into an existing 3D model. As part of this placement, our method also estimates the absolute scale of the scene from object semantics, which in this case constitutes the height distribution of the population. Further, we obtain a smooth approximation of the ground surface and recover the gravity vector of the scene directly from the individual person detections. We demonstrate the results of our approach on a number of unordered Internet photo collections, and we quantitatively evaluate the obtained absolute scene scales.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80971681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
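A minimal sketch of the scale-from-people idea: compare the heights of reconstructed people (in arbitrary SfM units) against a prior on human height and take a robust ratio. The prior value and the function name below are assumptions for illustration only, not the distribution or estimator used in the paper.

import numpy as np

# Assumed prior: adult height roughly 1.7 m on average (illustrative value,
# not the population distribution used in the paper).
MEAN_HEIGHT_M = 1.70

def estimate_scene_scale(person_heights_sfm_units):
    """Estimate metres-per-SfM-unit from detected person heights.

    person_heights_sfm_units: heights of reconstructed people in the
    reconstruction's arbitrary units. A robust ratio against the prior
    mean gives the absolute scale of the scene.
    """
    heights = np.asarray(person_heights_sfm_units, dtype=float)
    return MEAN_HEIGHT_M / np.median(heights)

if __name__ == "__main__":
    # e.g. three people measured as roughly 0.02 SfM units tall
    print(estimate_scene_scale([0.0205, 0.0213, 0.0198]), "m per SfM unit")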
Divide and Conquer for Full-Resolution Light Field Deblurring
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Pub Date: 2018-06-01. DOI: 10.1109/CVPR.2018.00672
M. Mohan, A. Rajagopalan
{"title":"Divide and Conquer for Full-Resolution Light Field Deblurring","authors":"M. Mohan, A. Rajagopalan","doi":"10.1109/CVPR.2018.00672","DOIUrl":"https://doi.org/10.1109/CVPR.2018.00672","url":null,"abstract":"The increasing popularity of computational light field (LF) cameras has necessitated the need for tackling motion blur which is a ubiquitous phenomenon in hand-held photography. The state-of-the-art method for blind deblurring of LFs of general 3D scenes is limited to handling only downsampled LF, both in spatial and angular resolution. This is due to the computational overhead involved in processing data-hungry full-resolution 4D LF altogether. Moreover, the method warrants high-end GPUs for optimization and is ineffective for wide-angle settings and irregular camera motion. In this paper, we introduce a new blind motion deblurring strategy for LFs which alleviates these limitations significantly. Our model achieves this by isolating 4D LF motion blur across the 2D subaperture images, thus paving the way for independent deblurring of these subaperture images. Furthermore, our model accommodates common camera motion parameterization across the subaperture images. Consequently, blind deblurring of any single subaperture image elegantly paves the way for cost-effective non-blind deblurring of the other subaperture images. Our approach is CPU-efficient computationally and can effectively deblur full-resolution LFs.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90308082","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
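The cost saving comes from the fact that, once the camera motion (and hence a blur kernel) has been estimated blindly from one sub-aperture image, the remaining views only need non-blind deblurring. A minimal stand-in for that non-blind step is Wiener deconvolution, sketched below under a shared, spatially invariant kernel assumption; the paper's actual motion model is richer than a single kernel, and both function names are illustrative.

import numpy as np

def wiener_deblur(blurred, kernel, snr=100.0):
    """Simple non-blind Wiener deconvolution of one 2D image."""
    h, w = blurred.shape
    K = np.fft.fft2(kernel, s=(h, w))                # kernel spectrum (zero-padded)
    B = np.fft.fft2(blurred)
    W = np.conj(K) / (np.abs(K) ** 2 + 1.0 / snr)    # Wiener filter
    return np.real(np.fft.ifft2(W * B))

def deblur_light_field(subapertures, shared_kernel):
    """Divide-and-conquer flavour: after a blur kernel has been estimated
    blindly from a single sub-aperture image, every other view is restored
    with the cheap non-blind step above."""
    return [wiener_deblur(img, shared_kernel) for img in subapertures]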
A Common Framework for Interactive Texture Transfer
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Pub Date: 2018-06-01. DOI: 10.1109/CVPR.2018.00665
Yifang Men, Z. Lian, Yingmin Tang, Jianguo Xiao
{"title":"A Common Framework for Interactive Texture Transfer","authors":"Yifang Men, Z. Lian, Yingmin Tang, Jianguo Xiao","doi":"10.1109/CVPR.2018.00665","DOIUrl":"https://doi.org/10.1109/CVPR.2018.00665","url":null,"abstract":"In this paper, we present a general-purpose solution to interactive texture transfer problems that better preserves both local structure and visual richness. It is challenging due to the diversity of tasks and the simplicity of required user guidance. The core idea of our common framework is to use multiple custom channels to dynamically guide the synthesis process. For interactivity, users can control the spatial distribution of stylized textures via semantic channels. The structure guidance, acquired by two stages of automatic extraction and propagation of structure information, provides a prior for initialization and preserves the salient structure by searching the nearest neighbor fields (NNF) with structure coherence. Meanwhile, texture coherence is also exploited to maintain similar style with the source image. In addition, we leverage an improved PatchMatch with extended NNF and matrix operations to obtain transformable source patches with richer geometric information at high speed. We demonstrate the effectiveness and superiority of our method on a variety of scenes through extensive comparisons with state-of-the-art algorithms.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90954518","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 23
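The nearest-neighbour field (NNF) objective that PatchMatch accelerates can be written down directly. The brute-force version below is only meant to make that objective explicit for tiny images; PatchMatch reaches the same field far faster through random search and propagation, and the paper's extended NNF carries additional guidance channels not modelled here.

import numpy as np

def brute_force_nnf(src, tgt, patch=5):
    """Nearest-neighbour field by exhaustive search (tiny images only).

    For every patch in tgt, find the most similar patch in src under
    sum-of-squared-differences, and record its (row, col) centre.
    """
    h, w = tgt.shape[:2]
    sh, sw = src.shape[:2]
    r = patch // 2
    nnf = np.zeros((h, w, 2), dtype=int)
    for y in range(r, h - r):
        for x in range(r, w - r):
            t = tgt[y - r:y + r + 1, x - r:x + r + 1]
            best, best_d = (r, r), np.inf
            for sy in range(r, sh - r):
                for sx in range(r, sw - r):
                    s = src[sy - r:sy + r + 1, sx - r:sx + r + 1]
                    d = np.sum((s - t) ** 2)
                    if d < best_d:
                        best, best_d = (sy, sx), d
            nnf[y, x] = best
    return nnf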
Local and Global Optimization Techniques in Graph-Based Clustering
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Pub Date: 2018-06-01. DOI: 10.1109/CVPR.2018.00364
Daiki Ikami, T. Yamasaki, K. Aizawa
{"title":"Local and Global Optimization Techniques in Graph-Based Clustering","authors":"Daiki Ikami, T. Yamasaki, K. Aizawa","doi":"10.1109/CVPR.2018.00364","DOIUrl":"https://doi.org/10.1109/CVPR.2018.00364","url":null,"abstract":"The goal of graph-based clustering is to divide a dataset into disjoint subsets with members similar to each other from an affinity (similarity) matrix between data. The most popular method of solving graph-based clustering is spectral clustering. However, spectral clustering has drawbacks. Spectral clustering can only be applied to macroaverage-based cost functions, which tend to generate undesirable small clusters. This study first introduces a novel cost function based on micro-average. We propose a local optimization method, which is widely applicable to graph-based clustering cost functions. We also propose an initial-guess-free algorithm to avoid its initialization dependency. Moreover, we present two global optimization techniques. The experimental results exhibit significant clustering performances from our proposed methods, including 100% clustering accuracy in the COIL-20 dataset.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91104404","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
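One generic reading of the macro- versus micro-average distinction drawn above: macro-averaging takes a per-cluster mean and then averages over clusters, so tiny clusters carry full weight, while micro-averaging pools all within-cluster pairs so every pair counts equally. The sketch below computes both for a hard clustering; it illustrates the distinction only and is not necessarily the paper's exact cost function.

import numpy as np

def within_cluster_scores(A, labels):
    """Macro- vs micro-averaged within-cluster affinity for a hard clustering.

    A:      (n, n) symmetric affinity matrix
    labels: (n,)   integer cluster assignment
    Returns (macro_average, micro_average).
    """
    macro_terms, pooled_sum, pooled_cnt = [], 0.0, 0
    for c in np.unique(labels):
        idx = np.where(labels == c)[0]
        block = A[np.ix_(idx, idx)]        # within-cluster affinities
        macro_terms.append(block.mean())   # per-cluster mean
        pooled_sum += block.sum()          # pooled over all pairs
        pooled_cnt += block.size
    return np.mean(macro_terms), pooled_sum / pooled_cnt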
Two-Step Quantization for Low-bit Neural Networks
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Pub Date: 2018-06-01. DOI: 10.1109/CVPR.2018.00460
Peisong Wang, Qinghao Hu, Yifan Zhang, Chunjie Zhang, Yang Liu, Jian Cheng
{"title":"Two-Step Quantization for Low-bit Neural Networks","authors":"Peisong Wang, Qinghao Hu, Yifan Zhang, Chunjie Zhang, Yang Liu, Jian Cheng","doi":"10.1109/CVPR.2018.00460","DOIUrl":"https://doi.org/10.1109/CVPR.2018.00460","url":null,"abstract":"Every bit matters in the hardware design of quantized neural networks. However, extremely-low-bit representation usually causes large accuracy drop. Thus, how to train extremely-low-bit neural networks with high accuracy is of central importance. Most existing network quantization approaches learn transformations (low-bit weights) as well as encodings (low-bit activations) simultaneously. This tight coupling makes the optimization problem difficult, and thus prevents the network from learning optimal representations. In this paper, we propose a simple yet effective Two-Step Quantization (TSQ) framework, by decomposing the network quantization problem into two steps: code learning and transformation function learning based on the learned codes. For the first step, we propose the sparse quantization method for code learning. The second step can be formulated as a non-linear least square regression problem with low-bit constraints, which can be solved efficiently in an iterative manner. Extensive experiments on CIFAR-10 and ILSVRC-12 datasets demonstrate that the proposed TSQ is effective and outperforms the state-of-the-art by a large margin. Especially, for 2-bit activation and ternary weight quantization of AlexNet, the accuracy of our TSQ drops only about 0.5 points compared with the full-precision counterpart, outperforming current state-of-the-art by more than 5 points.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80763609","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 113
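To make the two-step decomposition concrete, here is a heavily simplified sketch: a thresholding step that produces sparse activation codes, followed by a per-row ternary weight fit using the common 0.7 * mean(|w|) heuristic. This is a generic low-bit quantization illustration under those assumptions, not the paper's TSQ solver, and both function names are invented.

import numpy as np

def sparse_quantize_activations(x, threshold):
    """Step 1 (code learning, simplified): two-level sparse quantization.
    Activations at or below the threshold are clamped to 0, the rest to 1."""
    return (x > threshold).astype(float)

def fit_ternary_weights(W):
    """Step 2 (transformation learning, simplified): per-row ternary weights.
    Approximately solve min ||W - alpha * T||^2 with T in {-1, 0, +1},
    using the heuristic threshold 0.7 * mean(|w|) per row."""
    T = np.zeros_like(W, dtype=float)
    alphas = np.zeros(W.shape[0])
    for i, w in enumerate(W):
        delta = 0.7 * np.abs(w).mean()
        mask = np.abs(w) > delta
        T[i, mask] = np.sign(w[mask])
        # optimal scale for the fixed support is the mean magnitude over it
        alphas[i] = np.abs(w[mask]).mean() if mask.any() else 0.0
    return alphas, T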
Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net
2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Pub Date: 2018-06-01. DOI: 10.1109/CVPR.2018.00376
Wenjie Luo, Binh Yang, R. Urtasun
{"title":"Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net","authors":"Wenjie Luo, Binh Yang, R. Urtasun","doi":"10.1109/CVPR.2018.00376","DOIUrl":"https://doi.org/10.1109/CVPR.2018.00376","url":null,"abstract":"In this paper we propose a novel deep neural network that is able to jointly reason about 3D detection, tracking and motion forecasting given data captured by a 3D sensor. By jointly reasoning about these tasks, our holistic approach is more robust to occlusion as well as sparse data at range. Our approach performs 3D convolutions across space and time over a bird's eye view representation of the 3D world, which is very efficient in terms of both memory and computation. Our experiments on a new very large scale dataset captured in several north american cities, show that we can outperform the state-of-the-art by a large margin. Importantly, by sharing computation we can perform all tasks in as little as 30 ms.","PeriodicalId":6564,"journal":{"name":"2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78450607","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 540
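The input representation is the part that is easy to sketch: each LiDAR sweep is rasterised into a bird's-eye-view occupancy grid, and stacking several sweeps along a time axis gives a 4D tensor that a space-time convolutional detector can consume. The voxelisation below uses illustrative grid extents and resolution, not the paper's configuration.

import numpy as np

def bev_occupancy(points, x_range=(-40.0, 40.0), y_range=(0.0, 70.0),
                  z_range=(-2.0, 1.0), res=0.2, z_bins=5):
    """Rasterise one LiDAR sweep (N, 3) into a bird's-eye-view occupancy grid.

    Returns a (z_bins, H, W) tensor of 0/1 occupancy. Stacking the grids of
    several sweeps along a new leading axis yields the space-time input.
    """
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    keep = ((x >= x_range[0]) & (x < x_range[1]) &
            (y >= y_range[0]) & (y < y_range[1]) &
            (z >= z_range[0]) & (z < z_range[1]))
    x, y, z = x[keep], y[keep], z[keep]
    W = int((x_range[1] - x_range[0]) / res)
    H = int((y_range[1] - y_range[0]) / res)
    xi = ((x - x_range[0]) / res).astype(int)
    yi = ((y - y_range[0]) / res).astype(int)
    zi = ((z - z_range[0]) / (z_range[1] - z_range[0]) * z_bins).astype(int)
    zi = np.clip(zi, 0, z_bins - 1)
    grid = np.zeros((z_bins, H, W), dtype=np.float32)
    grid[zi, yi, xi] = 1.0
    return grid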