{"title":"注意引导视觉惯性里程计","authors":"Li Liu, Ge Li, Thomas H. Li","doi":"10.1109/ICASSP39728.2021.9413912","DOIUrl":null,"url":null,"abstract":"Visual-inertial odometry (VIO) aims to predict trajectory by ego- motion estimation. In recent years, end-to-end VIO has made great progress. However, how to handle visual and inertial measurements and make full use of the complementarity of cameras and inertial sensors remains a challenge. In the paper, we propose a novel attention guided deep framework for visual-inertial odometry (ATVIO) to improve the performance of VIO. Specifically, we extraordinarily concentrate on the effective utilization of the Inertial Measurement Unit (IMU) information. Therefore, we carefully design a one-dimension inertial feature encoder for IMU data processing. The network can extract inertial features quickly and effectively. Meanwhile, we should prevent the inconsistency problem when fusing inertial and visual features. Hence, we explore a novel cross-domain channel attention block to combine the extracted features in a more adaptive manner. Extensive experiments demonstrate that our method achieves competitive performance against state-of-the-art VIO methods.","PeriodicalId":347060,"journal":{"name":"ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2021-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":"{\"title\":\"ATVIO: Attention Guided Visual-Inertial Odometry\",\"authors\":\"Li Liu, Ge Li, Thomas H. Li\",\"doi\":\"10.1109/ICASSP39728.2021.9413912\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Visual-inertial odometry (VIO) aims to predict trajectory by ego- motion estimation. In recent years, end-to-end VIO has made great progress. However, how to handle visual and inertial measurements and make full use of the complementarity of cameras and inertial sensors remains a challenge. In the paper, we propose a novel attention guided deep framework for visual-inertial odometry (ATVIO) to improve the performance of VIO. Specifically, we extraordinarily concentrate on the effective utilization of the Inertial Measurement Unit (IMU) information. Therefore, we carefully design a one-dimension inertial feature encoder for IMU data processing. The network can extract inertial features quickly and effectively. Meanwhile, we should prevent the inconsistency problem when fusing inertial and visual features. Hence, we explore a novel cross-domain channel attention block to combine the extracted features in a more adaptive manner. 
Visual-inertial odometry (VIO) aims to predict trajectory by ego-motion estimation. In recent years, end-to-end VIO has made great progress. However, how to handle visual and inertial measurements and make full use of the complementarity of cameras and inertial sensors remains a challenge. In this paper, we propose a novel attention-guided deep framework for visual-inertial odometry (ATVIO) to improve the performance of VIO. Specifically, we focus on the effective utilization of the Inertial Measurement Unit (IMU) information and carefully design a one-dimensional inertial feature encoder for IMU data processing. The network can extract inertial features quickly and effectively. Meanwhile, the inconsistency between inertial and visual features must be addressed when they are fused. Hence, we explore a novel cross-domain channel attention block that combines the extracted features in a more adaptive manner. Extensive experiments demonstrate that our method achieves competitive performance against state-of-the-art VIO methods.
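To make the two components described above more concrete, the following is a minimal sketch, not the authors' released code: a one-dimensional convolutional encoder for raw IMU windows and a squeeze-and-excitation-style channel-attention block that re-weights the concatenated visual and inertial features before fusion. All layer sizes, channel counts, and class names here are illustrative assumptions; the paper's actual cross-domain channel attention block may differ in detail.

```python
# Hedged sketch of a 1-D IMU encoder and a channel-attention fusion block.
# Dimensions and layer choices are assumptions, not the published architecture.
import torch
import torch.nn as nn


class InertialEncoder1D(nn.Module):
    """Encode a window of IMU samples (6 channels: gyro + accel) with 1-D convolutions."""

    def __init__(self, in_channels: int = 6, out_dim: int = 256):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv1d(in_channels, 64, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv1d(64, 128, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool1d(1),          # collapse the time axis
        )
        self.fc = nn.Linear(128, out_dim)

    def forward(self, imu: torch.Tensor) -> torch.Tensor:
        # imu: (batch, 6, T) raw measurements between two image frames
        x = self.conv(imu).squeeze(-1)        # (batch, 128)
        return self.fc(x)                     # (batch, out_dim)


class ChannelAttentionFusion(nn.Module):
    """Gate the concatenated visual + inertial features with learned per-channel weights."""

    def __init__(self, visual_dim: int = 512, inertial_dim: int = 256, reduction: int = 16):
        super().__init__()
        dim = visual_dim + inertial_dim
        self.gate = nn.Sequential(
            nn.Linear(dim, dim // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim),
            nn.Sigmoid(),
        )

    def forward(self, visual: torch.Tensor, inertial: torch.Tensor) -> torch.Tensor:
        fused = torch.cat([visual, inertial], dim=-1)   # (batch, visual_dim + inertial_dim)
        weights = self.gate(fused)                      # per-channel weights in (0, 1)
        return fused * weights                          # adaptively re-weighted fused feature


if __name__ == "__main__":
    imu = torch.randn(4, 6, 100)     # 4 samples, 100 IMU readings per image pair (assumed)
    visual = torch.randn(4, 512)     # placeholder for a visual feature from an image encoder
    inertial = InertialEncoder1D()(imu)
    fused = ChannelAttentionFusion()(visual, inertial)
    print(fused.shape)               # torch.Size([4, 768])
```

The gating step is what gives the fusion its adaptive character: channels whose weights approach zero contribute little to the pose regression, which is one plausible way to suppress inconsistent inertial or visual responses before they are combined.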