Title: MCNN-VIO: A High-Accuracy Multi-Camera Visual-Inertial Odometry With Neural Networks
Authors: Shangru Yang; Yudong Liu; Shenwei Qu; Rizhi Dong; Boo Cheong Khoo; Sutthiphong Srigrarom; Qingjun Yang
Journal: IEEE Transactions on Instrumentation and Measurement, vol. 74, pp. 1-11 (impact factor 5.6; JCR Q1, Engineering, Electrical & Electronic)
Publication date: 2025-04-02
DOI: 10.1109/TIM.2025.3557101
URL: https://ieeexplore.ieee.org/document/10947565/
Citations: 0
Abstract
Visual-inertial odometry (VIO) is a state-estimation method with the advantages of small size and low hardware cost. Filtering-based systems in particular offer high accuracy at a low computational load. Multiple cameras facing different directions are often used to expand the observation scope, which improves on the poor robustness of single-camera estimation and strengthens object perception. However, multiple cameras track more features, which not only increases computation but also introduces many features with large errors, leading to low accuracy. This article proposes a new framework, multi-camera with neural networks VIO (MCNN-VIO), based on the multistate constraint Kalman filter (MSCKF). It fuses two stereo cameras with non-overlapping fields of view (FoV) and an inertial measurement unit (IMU). The feature processing strategies between different cameras are redesigned. The method uses neural networks to select features intelligently in a variety of complex environments. In addition, a novel feature selection strategy is proposed to obtain the poses closest to the true value for network training. This strategy can find the optimal solution within a limited number of stochastic and inclined combinations. The method was tested in feature-rich scenes and in challenging dark scenes. The experimental results show that it achieves higher accuracy and better robustness than the multi-camera configuration of the conventional algorithm, while maintaining competitive performance and low computational cost compared with a single-camera version.
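The abstract describes generating training targets by searching a limited number of feature combinations for the one whose pose is closest to the true value. The following is a minimal sketch of that general idea only, not the authors' implementation: the pose is reduced to a toy 2D translation estimated as the mean feature displacement, the combinations are drawn purely at random, and every name (`estimate_translation`, `pose_error`, `select_best_subset`) is hypothetical.

```python
import random


def estimate_translation(displacements):
    """Toy pose estimate (assumption): mean of per-feature 2D displacements."""
    n = len(displacements)
    return (
        sum(d[0] for d in displacements) / n,
        sum(d[1] for d in displacements) / n,
    )


def pose_error(estimate, truth):
    """Euclidean distance between an estimated and a true 2D translation."""
    return ((estimate[0] - truth[0]) ** 2 + (estimate[1] - truth[1]) ** 2) ** 0.5


def select_best_subset(displacements, truth, subset_size, num_trials, seed=0):
    """Try a limited number of random feature subsets and keep the one whose
    toy pose estimate is closest to the ground truth."""
    rng = random.Random(seed)
    indices = range(len(displacements))
    best_subset, best_error = None, float("inf")
    for _ in range(num_trials):
        subset = sorted(rng.sample(indices, subset_size))
        estimate = estimate_translation([displacements[i] for i in subset])
        error = pose_error(estimate, truth)
        if error < best_error:
            best_subset, best_error = subset, error
    return best_subset, best_error


# Hypothetical data: four consistent features and two with large errors.
features = [(1.0, 0.0), (1.1, -0.1), (0.9, 0.1), (5.0, 3.0), (1.0, 0.05), (-4.0, 2.0)]
true_translation = (1.0, 0.0)
subset, error = select_best_subset(features, true_translation, subset_size=3, num_trials=200)
print("selected feature indices:", subset, "error:", round(error, 4))
```

In a setup like this, the selected subset could serve as a label for training a network that predicts which features to keep. A real system would replace the toy mean-displacement estimate with the filter's pose update.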
About the journal:
Papers are sought that address innovative solutions to the development and use of electrical and electronic instruments and equipment to measure, monitor and/or record physical phenomena for the purpose of advancing measurement science, methods, functionality and applications. The scope of these papers may encompass: (1) theory, methodology, and practice of measurement; (2) design, development and evaluation of instrumentation and measurement systems and components used in generating, acquiring, conditioning and processing signals; (3) analysis, representation, display, and preservation of the information obtained from a set of measurements; and (4) scientific and technical support to establishment and maintenance of technical standards in the field of Instrumentation and Measurement.