Learning-based monocular visual-inertial odometry with $SE_2(3)$-EKF

IF 4.2 · Zone 2 (Computer Science) · JCR Q2 (Robotics)

Journal of Field Robotics · Pub Date: 2024-04-24 · DOI: 10.1002/rob.22349

Chi Guo, Jianlang Hu, Yarong Luo

Volume 41, Issue 6, Pages 1780-1796

Citations: 0

Abstract

Learning-based visual odometry (VO) has become popular because it achieves remarkable performance without hand-crafted image processing or burdensome calibration. Inertial navigation, meanwhile, can provide a localization solution that assists VO when VO produces poor state estimates under challenging visual conditions. Combining learning-based techniques with classical state estimation can therefore further improve pose estimation. In this paper, we propose a learning-based visual-inertial odometry (VIO) algorithm that consists of an end-to-end VO network and an $SE_2(3)$-Extended Kalman Filter (EKF). The VO network mainly combines a convolutional neural network with a recurrent neural network and takes two consecutive monocular images as input to produce relative pose estimates with associated uncertainties. The $SE_2(3)$-EKF, which has been proven to overcome the inconsistency issues of VIO, propagates states based on inertial measurement unit (IMU) kinematics and fuses the relative measurements and uncertainties from the VO network in its update step. Extensive experiments on the KITTI and EuRoC data sets demonstrate the superior performance of the proposed method compared with related methods.
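To make the two ingredients named in the abstract more concrete, the following is a minimal NumPy sketch, not the authors' implementation. It assumes the usual matrix form of an $SE_2(3)$ element (rotation, velocity, and position packed into a 5x5 matrix) and uses a textbook vector-space Kalman update to show how a measurement covariance, such as one predicted by a VO network, weights the correction. The function names (`se23_matrix`, `se23_inverse`, `kalman_update`, `rot_z`) are hypothetical, and the paper's filter works with a Lie-group error state rather than this simplified linear update.

```python
import numpy as np


def se23_matrix(R, v, p):
    """Pack rotation R (3x3), velocity v (3,), and position p (3,) into a 5x5 SE_2(3) element."""
    X = np.eye(5)
    X[:3, :3] = R
    X[:3, 3] = v
    X[:3, 4] = p
    return X


def se23_inverse(X):
    """Closed-form inverse: (R, v, p)^-1 = (R^T, -R^T v, -R^T p)."""
    R, v, p = X[:3, :3], X[:3, 3], X[:3, 4]
    return se23_matrix(R.T, -R.T @ v, -R.T @ p)


def kalman_update(x, P, z, H, R_meas):
    """Standard linear Kalman update (illustrative only).

    R_meas is the measurement covariance; in a learning-based setup it
    would be filled from the uncertainty the network reports (assumption).
    """
    S = H @ P @ H.T + R_meas              # innovation covariance
    K = P @ H.T @ np.linalg.inv(S)        # Kalman gain
    x_new = x + K @ (z - H @ x)           # corrected state
    P_new = (np.eye(len(x)) - K @ H) @ P  # corrected covariance
    return x_new, P_new


def rot_z(theta):
    """Rotation about the z-axis by theta radians."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


# Group structure: an element composed with its inverse gives the identity.
X = se23_matrix(rot_z(0.3), np.array([1.0, 0.0, 0.5]), np.array([2.0, -1.0, 0.0]))
identity_check = X @ se23_inverse(X)

# Same measurement, two different reported uncertainties.
x0, P0, H = np.array([0.0]), np.array([[1.0]]), np.array([[1.0]])
z = np.array([1.0])
x_confident, _ = kalman_update(x0, P0, z, H, np.array([[0.01]]))
x_uncertain, _ = kalman_update(x0, P0, z, H, np.array([[100.0]]))
```

With a small `R_meas` the estimate moves almost all the way to the measurement, while a large `R_meas` leaves it close to the prior. This is the mechanism by which an unreliable VO output under poor visual conditions would be down-weighted in favor of the IMU-propagated state.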

Source journal

Journal of Field Robotics (Engineering & Technology - Robotics)

CiteScore: 15.00

Self-citation rate: 3.60%

Articles published:

Review time: 6 months

About the journal: The Journal of Field Robotics seeks to promote scholarly publications dealing with the fundamentals of robotics in unstructured and dynamic environments. The Journal focuses on experimental robotics and encourages publication of work that has both theoretical and practical significance.