Wide-Baseline Relative Camera Pose Estimation with Directional Learning

Kefan Chen, Noah Snavely, Ameesh Makadia
{"title":"Wide-Baseline Relative Camera Pose Estimation with Directional Learning","authors":"Kefan Chen, Noah Snavely, A. Makadia","doi":"10.1109/CVPR46437.2021.00327","DOIUrl":null,"url":null,"abstract":"Modern deep learning techniques that regress the relative camera pose between two images have difficulty dealing with challenging scenarios, such as large camera motions resulting in occlusions and significant changes in perspective that leave little overlap between images. These models continue to struggle even with the benefit of large supervised training datasets. To address the limitations of these models, we take inspiration from techniques that show regressing keypoint locations in 2D and 3D can be improved by estimating a discrete distribution over keypoint locations. Analogously, in this paper we explore improving camera pose regression by instead predicting a discrete distribution over camera poses. To realize this idea, we introduce DirectionNet, which estimates discrete distributions over the 5D relative pose space using a novel parameterization to make the estimation problem tractable. Specifically, DirectionNet factorizes relative camera pose, specified by a 3D rotation and a translation direction, into a set of 3D direction vectors. Since 3D directions can be identified with points on the sphere, DirectionNet estimates discrete distributions on the sphere as its output. We evaluate our model on challenging synthetic and real pose estimation datasets constructed from Matterport3D and InteriorNet. Promising results show a near 50% reduction in error over direct regression methods.","PeriodicalId":339646,"journal":{"name":"2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"33","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CVPR46437.2021.00327","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 33

Abstract

Modern deep learning techniques that regress the relative camera pose between two images have difficulty dealing with challenging scenarios, such as large camera motions resulting in occlusions and significant changes in perspective that leave little overlap between images. These models continue to struggle even with the benefit of large supervised training datasets. To address the limitations of these models, we take inspiration from techniques that show regressing keypoint locations in 2D and 3D can be improved by estimating a discrete distribution over keypoint locations. Analogously, in this paper we explore improving camera pose regression by instead predicting a discrete distribution over camera poses. To realize this idea, we introduce DirectionNet, which estimates discrete distributions over the 5D relative pose space using a novel parameterization to make the estimation problem tractable. Specifically, DirectionNet factorizes relative camera pose, specified by a 3D rotation and a translation direction, into a set of 3D direction vectors. Since 3D directions can be identified with points on the sphere, DirectionNet estimates discrete distributions on the sphere as its output. We evaluate our model on challenging synthetic and real pose estimation datasets constructed from Matterport3D and InteriorNet. Promising results show a near 50% reduction in error over direct regression methods.