Calibration-free Visual-Inertial Fusion with Deep Convolutional Recurrent Neural Networks

S. Sheikhpour, M. Atia
{"title":"Calibration-free Visual-Inertial Fusion with Deep Convolutional Recurrent Neural Networks","authors":"S. Sheikhpour, M. Atia","doi":"10.33012/2019.16918","DOIUrl":null,"url":null,"abstract":"Visual-Inertial Odometry (VIO) has been one of the most popular yet affordable navigation systems for indoor and even outdoor applications. VIO can augment or replace Global Navigation Satellite Systems (GNSSs) under signal degradation or service interruptions. Conventionally, the fusion of visual and inertial modalities has been performed using optimization-based or filtering-based techniques such as nonlinear Least Squares (LS) or Extended Kalman Filter (EKF). These classic techniques, despite several simplifying approximations, involve sophisticated modelling and parameterization of the navigation problem, which necessitates expert fine-tuning of the navigation system. In this work, a calibration-free visual-inertial fusion technique using Deep Convolutional Recurrent Neural Networks (DCRNN) is proposed. The network employs a Convolutional Neural Network (CNN) to process the spatial information embedded in visual data and two Recurrent Neural Networks (RNNs) to process the inertial sensor measurements and the CNN output for final pose estimation. The network is trained with raw Inertial Measurement Unit (IMU) data and monocular camera frames as its inputs, and the relative pose as its output. Unlike the conventional VIO techniques, there is no need for IMU biases and scale factors, intrinsic and extrinsic parameters of the camera to be explicitly provided or modelled in the proposed navigation system, rather these parameters along with system dynamics are implicitly learned during the training phase. Moreover, since the inertial and visual data are fused at mid-layers in the network, deeper correlations of these two modalities are learned compared to a simple combination of the final pose estimates of both modalities at the output layers, hence, the fusion can be considered as a tightly-coupled fusion of visual and inertial modalities. The proposed VIO network is evaluated on real datasets and thorough discussion is provided on the capabilities of the deep learning approach toward VIO.","PeriodicalId":381025,"journal":{"name":"Proceedings of the 32nd International Technical Meeting of the Satellite Division of The Institute of Navigation (ION GNSS+ 2019)","volume":"132 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-10-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 32nd International Technical Meeting of the Satellite Division of The Institute of Navigation (ION GNSS+ 2019)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.33012/2019.16918","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Visual-Inertial Odometry (VIO) has become one of the most popular and affordable navigation solutions for indoor and even outdoor applications. VIO can augment or replace Global Navigation Satellite Systems (GNSSs) under signal degradation or service interruption. Conventionally, the fusion of visual and inertial modalities has been performed with optimization-based or filtering-based techniques such as nonlinear Least Squares (LS) or the Extended Kalman Filter (EKF). Despite several simplifying approximations, these classic techniques involve sophisticated modelling and parameterization of the navigation problem, which necessitates expert fine-tuning of the navigation system. In this work, a calibration-free visual-inertial fusion technique using a Deep Convolutional Recurrent Neural Network (DCRNN) is proposed. The network employs a Convolutional Neural Network (CNN) to process the spatial information embedded in the visual data and two Recurrent Neural Networks (RNNs): one to process the inertial sensor measurements and another to fuse these with the CNN output for the final pose estimation. The network is trained with raw Inertial Measurement Unit (IMU) data and monocular camera frames as its inputs and the relative pose as its output. Unlike conventional VIO techniques, the proposed navigation system does not require IMU biases and scale factors or the camera's intrinsic and extrinsic parameters to be explicitly provided or modelled; instead, these parameters, along with the system dynamics, are learned implicitly during the training phase. Moreover, because the inertial and visual data are fused at the network's mid-layers rather than by simply combining the final pose estimates of the two modalities at the output layer, deeper cross-modal correlations are learned; the fusion can therefore be considered a tightly-coupled fusion of the visual and inertial modalities. The proposed VIO network is evaluated on real datasets, and a thorough discussion is provided on the capabilities of the deep learning approach to VIO.
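To make the described architecture concrete, below is a minimal PyTorch sketch of such a visual-inertial DCRNN. All specifics here are illustrative assumptions rather than the authors' exact design: the class name VisualInertialDCRNN, the convolutional layer sizes, the use of LSTM cells for the two RNNs, the stacking of consecutive frame pairs at the CNN input, and the 6-DoF relative-pose output (three translation and three rotation parameters) are not specified in the abstract.

```python
# Minimal sketch of the described visual-inertial DCRNN (assumed details, see above).
import torch
import torch.nn as nn

class VisualInertialDCRNN(nn.Module):
    def __init__(self, imu_dim=6, feat_dim=256, hidden_dim=512):
        super().__init__()
        # CNN: extracts spatial features from stacked consecutive frames
        # (2 RGB frames -> 6 input channels), a common monocular-VO front end.
        self.cnn = nn.Sequential(
            nn.Conv2d(6, 32, kernel_size=7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.visual_fc = nn.Linear(128, feat_dim)
        # RNN 1: encodes the raw IMU stream between consecutive frames.
        self.imu_rnn = nn.LSTM(imu_dim, feat_dim, batch_first=True)
        # RNN 2: fuses visual and inertial features at the mid-layers
        # (the tightly-coupled fusion the abstract describes).
        self.fusion_rnn = nn.LSTM(2 * feat_dim, hidden_dim, batch_first=True)
        # Head: 6-DoF relative pose (3 translation + 3 rotation parameters).
        self.pose_head = nn.Linear(hidden_dim, 6)

    def forward(self, frames, imu):
        # frames: (B, T, 6, H, W) stacked consecutive frame pairs
        # imu:    (B, T, N, 6)    N accel/gyro samples per frame interval
        B, T = frames.shape[:2]
        vis = self.cnn(frames.flatten(0, 1)).flatten(1)   # (B*T, 128)
        vis = self.visual_fc(vis).view(B, T, -1)          # (B, T, feat)
        # Encode each interval's IMU sub-sequence; keep the last hidden state.
        _, (h, _) = self.imu_rnn(imu.flatten(0, 1))       # h: (1, B*T, feat)
        inert = h[-1].view(B, T, -1)                      # (B, T, feat)
        fused, _ = self.fusion_rnn(torch.cat([vis, inert], dim=-1))
        return self.pose_head(fused)                      # (B, T, 6) relative poses

# Example usage with random tensors (shapes are illustrative):
model = VisualInertialDCRNN()
frames = torch.randn(2, 4, 6, 64, 64)  # 2 sequences, 4 stacked frame pairs each
imu = torch.randn(2, 4, 10, 6)         # 10 IMU samples per frame interval
rel_poses = model(frames, imu)         # -> (2, 4, 6)
```

Training such a network would then regress the predicted relative poses against ground truth, for example with an L2 loss that weights the rotation terms separately, so that calibration parameters and sensor biases are absorbed into the learned weights rather than modelled explicitly.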