2020 International Conference on 3D Vision (3DV): Latest Publications

Differential Photometric Consistency
2020 International Conference on 3D Vision (3DV) | Pub Date: 2020-11-01 | DOI: 10.1109/3DV50981.2020.00023
Hongyi Fan, B. Kunsberg, B. Kimia
Abstract: A key bottleneck in the use of Multiview Stereo (MVS) to produce high-quality reconstructions is the gaps arising from textureless, shaded areas and a lack of fine-scale detail. Shape-from-Shading (SfS) has been used in conjunction with MVS to obtain fine-scale detail and veridical reconstruction in the gap areas. The similarity metric that gauges candidate correspondences is critical to this process, typically a combination of photometric consistency and brightness gradient constancy. Two observations motivate this paper. First, brightness gradient constancy can be erroneous due to foreshortening. Second, the standard ZSSD/NCC patchwise photometric consistency measure, when applied to shaded areas, is, to a first-order approximation, a calculation of brightness gradient differences, which can likewise be subject to foreshortening. The paper proposes a novel trinocular differential photometric consistency that constrains the brightness gradients in three views so that the image gradient in one view is completely determined by the image gradients at corresponding points in the other two views. The theoretical developments here advocate the integration of this new measure, whose viability in practice has been demonstrated in a set of illustrative numerical experiments.
Citations: 0
RidgeSfM: Structure from Motion via Robust Pairwise Matching Under Depth Uncertainty
2020 International Conference on 3D Vision (3DV) | Pub Date: 2020-11-01 | DOI: 10.1109/3DV50981.2020.00075
Benjamin Graham, David Novotný
Abstract: We consider the problem of simultaneously estimating a dense depth map and camera pose for a large set of images of an indoor scene. While classical SfM pipelines rely on a two-step approach where cameras are first estimated using a bundle adjustment in order to ground the ensuing multi-view stereo stage, both our poses and dense reconstructions are a direct output of an altered bundle adjuster. To this end, we parametrize each depth map with a linear combination of a limited number of basis "depth-planes" predicted in a monocular fashion by a deep net. Using a set of high-quality sparse keypoint matches, we optimize over the per-frame linear combinations of depth planes and camera poses to form a geometrically consistent cloud of keypoints. Although our bundle adjustment only considers sparse keypoints, the inferred linear coefficients of the basis planes immediately give us dense depth maps. RidgeSfM is able to collectively align hundreds of frames, which is its main advantage over recent memory-heavy deep alternatives that are typically capable of aligning no more than 10 frames. Quantitative comparisons reveal performance superior to a state-of-the-art large-scale SfM pipeline.
Citations: 3
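The depth-plane parametrization described above is easy to picture in code: a dense depth map is just a linear combination of a small number of basis images, so the bundle adjuster only optimizes a handful of scalars per frame. The sketch below uses random placeholder planes (in RidgeSfM the basis comes from a monocular deep net) and toy sizes; `compose_depth` and all dimensions are illustrative assumptions, not the paper's API.

```python
import numpy as np

def compose_depth(basis_planes, coeffs):
    """Compose a dense depth map as a linear combination of basis
    'depth-planes'. basis_planes: (K, H, W), coeffs: (K,).
    Contracting coeffs against the first axis yields an (H, W) map."""
    return np.tensordot(coeffs, basis_planes, axes=1)

rng = np.random.default_rng(0)
K, H, W = 8, 4, 6                       # tiny toy sizes for illustration
basis = rng.standard_normal((K, H, W))  # stand-in for network-predicted planes
w = rng.standard_normal(K)              # per-frame coefficients the bundle
                                        # adjuster would optimize

depth = compose_depth(basis, w)
print(depth.shape)  # a dense (H, W) depth map from only K scalars per frame
```

This is why the sparse bundle adjustment "immediately" yields dense depth: once the K coefficients are fit to the sparse keypoints, the full map falls out of the linear combination for free.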
FC-vSLAM: Integrating Feature Credibility in Visual SLAM
2020 International Conference on 3D Vision (3DV) | Pub Date: 2020-11-01 | DOI: 10.1109/3DV50981.2020.00106
Shuai Xie, Wei Ma, Qiuyuan Wang, Ruchang Xu, H. Zha
Abstract: Feature-based visual SLAM (vSLAM) systems compute camera poses and scene maps by detecting and matching 2D features, mostly points and line segments, from image sequences. These systems often suffer from unreliable detections. In this paper, we define feature credibility (FC) for both points and line segments, formulate it into vSLAM, and develop an FC-vSLAM system based on the widely used ORB-SLAM framework. Compared with existing credibility definitions, the proposed one is more comprehensive, considering both temporal observation stability and perspective triangulation reliability. We incorporate the credibility into our SLAM system to suppress the influence of unreliable features on pose and map optimization. We also present a way to refine line-endpoint observations via their multi-view correspondences, improving the integrity of the 3D maps. Experiments on both the TUM and 7-Scenes datasets demonstrate that our feature credibility and the multi-view line optimization are effective; the developed FC-vSLAM system outperforms existing popular feature-based systems in both localization and mapping.
Citations: 0
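The core idea of folding credibility into the optimization can be illustrated with a minimal weighted estimator: unreliable features contribute to the solution in proportion to their credibility, so a grossly wrong match with low credibility barely perturbs the result. This toy estimates only a 2D translation from point matches; the function name, data, and weighting scheme are illustrative assumptions (the paper's credibility combines temporal stability and triangulation reliability inside a full pose/map optimization).

```python
import numpy as np

def weighted_translation(src, dst, credibility):
    """Estimate a 2D translation from point matches as a
    credibility-weighted average of the per-match residuals."""
    w = credibility / credibility.sum()
    return ((dst - src) * w[:, None]).sum(axis=0)

src = np.array([[0., 0.], [1., 0.], [0., 1.], [2., 2.]])
true_t = np.array([0.5, -0.2])
dst = src + true_t
dst[3] += np.array([3.0, 3.0])          # one grossly wrong match
cred = np.array([1.0, 1.0, 1.0, 0.01])  # low credibility for the bad feature

est = weighted_translation(src, dst, cred)
print(est)  # close to [0.5, -0.2] despite the outlier
```

With uniform weights the outlier would drag the estimate far off; down-weighting it by credibility recovers the true motion almost exactly, which is precisely the suppression effect the abstract describes.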
Deep LiDAR Localization Using Optical Flow Sensor-Map Correspondences
2020 International Conference on 3D Vision (3DV) | Pub Date: 2020-11-01 | DOI: 10.1109/3DV50981.2020.00094
Anders Sunegård, L. Svensson, Torsten Sattler
Abstract: In this paper we propose a method for accurate localization of a multi-layer LiDAR sensor in a pre-recorded map, given a coarse initialization pose. The foundation of the algorithm is the usage of neural network optical flow predictions. We train a network to encode representations of the sensor measurement and the map, and then regress flow vectors at each spatial position in the sensor feature map. The flow regression network is straightforward to train, and the resulting flow field can be used with standard techniques for computing sensor pose from sensor-to-map correspondences. Additionally, the network can regress flow at different spatial scales, which means that it is able to handle both position recovery and high-accuracy localization. We demonstrate average localization accuracy of $< 0.04\,\mathrm{m}$ in position and $< 0.1^{\circ}$ in heading angle for a vehicle driving application with simulated LiDAR measurements, which is similar to point-to-point iterative closest point (ICP). The algorithm typically manages to recover position with a prior error of more than 20 m and is significantly more robust to scenes with non-salient or repetitive structure than the baselines used for comparison.
Citations: 0
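The "standard techniques for computing sensor pose from sensor-to-map correspondences" mentioned above typically reduce to a closed-form rigid fit such as the Kabsch/Procrustes solution. The sketch below recovers a 2D rotation and translation from ideal correspondences; in the paper those correspondences would come from the regressed flow field, while here they are fabricated, and the function name is an illustrative assumption.

```python
import numpy as np

def pose_from_correspondences(sensor_pts, map_pts):
    """Recover a rigid 2D pose (R, t) from sensor-to-map point
    correspondences via the Kabsch/Procrustes solution: center both
    sets, take the SVD of the cross-covariance, and correct for a
    possible reflection before reading off the rotation."""
    cs, cm = sensor_pts.mean(0), map_pts.mean(0)
    H = (sensor_pts - cs).T @ (map_pts - cm)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    t = cm - R @ cs
    return R, t

theta = 0.3
R_true = np.array([[np.cos(theta), -np.sin(theta)],
                   [np.sin(theta),  np.cos(theta)]])
t_true = np.array([1.5, -0.7])
sensor = np.random.default_rng(1).standard_normal((20, 2))
map_pts = sensor @ R_true.T + t_true   # ideal flow-derived correspondences

R, t = pose_from_correspondences(sensor, map_pts)
```

In practice the correspondences are noisy and the fit is wrapped in a robust loop (e.g. RANSAC), but the closed-form core is the same.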
Learning Wasserstein Isometric Embedding for Point Clouds
2020 International Conference on 3D Vision (3DV) | Pub Date: 2020-11-01 | DOI: 10.1109/3DV50981.2020.00057
Keisuke Kawano, Satoshi Koide, Takuro Kutsuna
Abstract: The Wasserstein distance has been employed for determining the distance between point clouds, which have variable numbers of points and invariance to point order. However, the high computational cost associated with the Wasserstein distance hinders its practical applications for large-scale datasets. We propose a new embedding method for point clouds, which aims to embed point clouds into a Euclidean space, isometric to the Wasserstein space defined on the point clouds. In numerical experiments, we demonstrate that the point clouds decoded from the Euclidean averages and the interpolations in the embedding space accurately mimic the Wasserstein barycenters and interpolations of the point clouds. Furthermore, we show that the embedding vectors can be utilized as inputs for machine learning models (e.g., principal component analysis and neural networks).
Citations: 3
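What "Wasserstein-isometric embedding" means is easiest to see in 1D, where an exact such embedding exists in closed form: sorting a point cloud and scaling by 1/sqrt(n) maps the 2-Wasserstein distance between equal-size clouds to the plain Euclidean distance between the embedding vectors. The paper learns an approximate analogue of this for 3D clouds with a neural network; the sketch below only illustrates the isometry property, and both helper names are assumptions.

```python
import numpy as np

def embed_1d(points):
    """Exact Wasserstein-isometric embedding for 1D point clouds:
    the sorted cloud scaled by 1/sqrt(n)."""
    return np.sort(points) / np.sqrt(len(points))

def w2_1d(x, y):
    """Closed-form 2-Wasserstein distance between equal-size 1D
    empirical distributions: RMS gap between sorted samples."""
    return np.sqrt(np.mean((np.sort(x) - np.sort(y)) ** 2))

rng = np.random.default_rng(2)
x = rng.standard_normal(100)
y = rng.standard_normal(100) + 1.0

euclid = np.linalg.norm(embed_1d(x) - embed_1d(y))
print(np.isclose(euclid, w2_1d(x, y)))  # True: the embedding is isometric
```

Once clouds live in such a space, expensive Wasserstein barycenters and interpolations reduce to cheap Euclidean averages of embedding vectors, which is exactly the speedup the abstract targets.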
Localising In Complex Scenes Using Balanced Adversarial Adaptation
2020 International Conference on 3D Vision (3DV) | Pub Date: 2020-11-01 | DOI: 10.1109/3DV50981.2020.00116
Gil Avraham, Yan Zuo, T. Drummond
Abstract: Domain adaptation and generative modelling have collectively mitigated the expensive nature of data collection and labelling by leveraging the rich abundance of accurate, labelled data in simulation environments. In this work, we study the performance gap that exists between representations optimised for localisation on simulation environments and the application of such representations in a real-world setting. Our method exploits the shared geometric similarities between simulation and real-world environments whilst maintaining invariance towards visual discrepancies. This is achieved by optimising a representation extractor to project both simulated and real representations into a shared representation space. Our method uses a symmetrical adversarial approach which encourages the representation extractor to conceal the domain that features are extracted from and simultaneously preserves robust attributes between source and target domains that are beneficial for localisation. We evaluate our method by adapting representations optimised for indoor Habitat simulated environments (Matterport3D and Replica) to a real-world indoor environment (Active Vision Dataset), showing that it compares favourably against fully-supervised approaches.
Citations: 0
Deep Learning Based Single-Photon 3D Imaging with Multiple Returns
2020 International Conference on 3D Vision (3DV) | Pub Date: 2020-11-01 | DOI: 10.1109/3DV50981.2020.00130
Hao Tan, Jiayong Peng, Zhiwei Xiong, Dong Liu, Xin Huang, Zheng-Ping Li, Yu Hong, Feihu Xu
Abstract: The single-photon avalanche diode (SPAD) has been widely used in active 3D imaging due to its extremely high photon sensitivity and picosecond time resolution. However, long-range active 3D imaging is still a great challenge, since only a few signal photons mixed with strong background noise can return from multiple reflectors of the scene due to the divergence of the light beam and the receiver's field of view (FoV), which brings considerable distortion and blur to the recovered depth map. In this paper, we propose a deep learning based depth reconstruction method for long-range single-photon 3D imaging where the "multiple-returns" issue exists. Specifically, we model this problem as a deblurring task and design a multi-scale convolutional neural network combined with elaborate loss functions, which promote the reconstruction of an accurate depth map with fine details and clear object boundaries. The proposed method achieves superior performance over several different sizes of receiver FoV on a synthetic dataset compared with existing state-of-the-art methods, and the model trained under a specific FoV generalizes strongly across different FoV sizes, which is essential for practical applications. Moreover, we conduct outdoor experiments and demonstrate the effectiveness of our method in a real-world long-range imaging system.
Citations: 3
Benchmarking Image Retrieval for Visual Localization
2020 International Conference on 3D Vision (3DV) | Pub Date: 2020-11-01 | DOI: 10.1109/3DV50981.2020.00058
No'e Pion, M. Humenberger, G. Csurka, Yohann Cabon, Torsten Sattler
Abstract: Visual localization, i.e., camera pose estimation in a known scene, is a core component of technologies such as autonomous driving and augmented reality. State-of-the-art localization approaches often rely on image retrieval techniques for one of two tasks: (1) provide an approximate pose estimate or (2) determine which parts of the scene are potentially visible in a given query image. It is common practice to use state-of-the-art image retrieval algorithms for these tasks. These algorithms are often trained for the goal of retrieving the same landmark under a large range of viewpoint changes. However, robustness to viewpoint changes is not necessarily desirable in the context of visual localization. This paper focuses on understanding the role of image retrieval for multiple visual localization tasks. We introduce a benchmark setup and compare state-of-the-art retrieval representations on multiple datasets. We show that retrieval performance on classical landmark retrieval/recognition tasks correlates with localization performance only for some tasks, not all. This indicates a need for retrieval approaches specifically designed for localization tasks. Our benchmark and evaluation protocols are available at https://github.com/naver/kapture-localization.
Citations: 44
Fast Simultaneous Gravitational Alignment of Multiple Point Sets
2020 International Conference on 3D Vision (3DV) | Pub Date: 2020-11-01 | DOI: 10.1109/3DV50981.2020.00019
Vladislav Golyanik, Soshi Shimada, C. Theobalt
Abstract: The problem of simultaneous rigid alignment of multiple unordered point sets which is unbiased towards any of the inputs has recently attracted increasing interest, and several reliable methods have been newly proposed. While being remarkably robust towards noise and clustered outliers, current approaches require sophisticated initialisation schemes and do not scale well to large point sets. This paper proposes a new resilient technique for simultaneous registration of multiple point sets by interpreting the latter as particle swarms rigidly moving in the mutually induced force fields. Thanks to the improved simulation with altered physical laws and acceleration of globally multiply-linked point interactions with a 2^D-tree (D is the space dimensionality), our Multi-Body Gravitational Approach (MBGA) is robust to noise and missing data while supporting more massive point sets than previous methods (with 10^5 points and more). In various experimental settings, MBGA is shown to outperform several baseline point set alignment approaches in terms of accuracy and runtime. We make our source code available for the community to facilitate the reproducibility of the results: http://gvv.mpi-inf.mpg.de/projects/MBGA/
Citations: 1
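The particle-swarm view above can be made concrete with the basic force computation: every point of one set is pulled toward every point of the other, with a softening term in the denominator (one of the "altered physical laws") so that near-coincident points do not produce singular forces. This is only a toy of the pairwise interaction under unit-mass assumptions; MBGA moves whole sets rigidly and accelerates these O(N*M) interactions with a 2^D-tree, neither of which is shown here.

```python
import numpy as np

def gravitational_forces(movers, attractors, eps=0.1):
    """Softened gravitational force on each 'mover' induced by the
    other point set (unit masses). Returns an (N, D) force array.
    eps prevents the force from blowing up at small distances."""
    diff = attractors[None, :, :] - movers[:, None, :]    # (N, M, D)
    dist2 = (diff ** 2).sum(-1) + eps ** 2                # softened squared distance
    return (diff / dist2[..., None] ** 1.5).sum(axis=1)   # sum over attractors

rng = np.random.default_rng(3)
a = rng.standard_normal((50, 3))
b = a + np.array([2.0, 0.0, 0.0])   # same cloud, translated along +x

net_pull = gravitational_forces(a, b).mean(axis=0)
```

Because `b` is just `a` shifted along +x, the net pull on `a` points toward +x; iterating such force-driven rigid updates is what drives the sets into alignment.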
A Transformer-Based Network for Dynamic Hand Gesture Recognition
2020 International Conference on 3D Vision (3DV) | Pub Date: 2020-11-01 | DOI: 10.1109/3DV50981.2020.00072
Andrea D'Eusanio, A. Simoni, S. Pini, G. Borghi, R. Vezzani, R. Cucchiara
Abstract: Transformer-based neural networks represent a successful self-attention mechanism that achieves state-of-the-art results in language understanding and sequence modeling. However, their application to visual data and, in particular, to the dynamic hand gesture recognition task has not yet been deeply investigated. In this paper, we propose a transformer-based architecture for the dynamic hand gesture recognition task. We show that the employment of a single active depth sensor, specifically the usage of depth maps and the surface normals estimated from them, achieves state-of-the-art results, outperforming all methods available in the literature on two automotive datasets, namely NVidia Dynamic Hand Gesture and Briareo. Moreover, we test the method with other data types available with common RGB-D devices, such as infrared and color data. We also assess the performance in terms of inference time and number of parameters, showing that the proposed framework is suitable for an online in-car infotainment system.
Citations: 20