{"title":"基于事件的高速机动视觉惯性状态估计","authors":"Xiuyuan Lu;Yi Zhou;Jiayao Mai;Kuan Dai;Yang Xu;Shaojie Shen","doi":"10.1109/TRO.2025.3584544","DOIUrl":null,"url":null,"abstract":"Neuromorphic event-based cameras are bioinspired visual sensors with asynchronous pixels and extremely high temporal resolution. Such favorable properties make them an excellent choice for solving state estimation tasks under high-speed maneuvers. However, failures of camera pose tracking are frequently witnessed in state-of-the-art event-based visual odometry systems when the local map cannot be updated timely or feature matching is unreliable. One of the biggest roadblocks in this field is the absence of efficient and robust methods for data association without imposing any assumptions on the environment. This problem seems, however, unlikely to be addressed as in standard vision because of the motion-dependent nature of event data. To address this, we propose a map-free design for event-based visual-inertial state estimation in this article. Instead of estimating camera position, we find that recovering the instantaneous linear velocity aligns better with event cameras’ differential working principle. The proposed system uses raw data from a stereo event camera and an inertial measurement unit (IMU) as input, and adopts a dual-end architecture. The front-end preprocesses raw events and executes the computation of normal flow and depth information. To handle the temporally nonequispaced event data and establish association with temporally nonaligned IMU’s measurements, the back-end employs a continuous-time formulation and a sliding-window scheme that can progressively estimate the linear velocity and IMU’s bias. Experiments on synthetic and real data show our method achieves low-latency, metric-scale velocity estimation. 
To the best of the authors’ knowledge, this is the first real-time, purely event-based visual-inertial state estimator for high-speed maneuvers, requiring only sufficient textures and imposing no additional constraints on either the environment or motion pattern.","PeriodicalId":50388,"journal":{"name":"IEEE Transactions on Robotics","volume":"41 ","pages":"4439-4458"},"PeriodicalIF":10.5000,"publicationDate":"2025-06-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Event-Based Visual-Inertial State Estimation for High-Speed Maneuvers\",\"authors\":\"Xiuyuan Lu;Yi Zhou;Jiayao Mai;Kuan Dai;Yang Xu;Shaojie Shen\",\"doi\":\"10.1109/TRO.2025.3584544\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Neuromorphic event-based cameras are bioinspired visual sensors with asynchronous pixels and extremely high temporal resolution. Such favorable properties make them an excellent choice for solving state estimation tasks under high-speed maneuvers. However, failures of camera pose tracking are frequently witnessed in state-of-the-art event-based visual odometry systems when the local map cannot be updated timely or feature matching is unreliable. One of the biggest roadblocks in this field is the absence of efficient and robust methods for data association without imposing any assumptions on the environment. This problem seems, however, unlikely to be addressed as in standard vision because of the motion-dependent nature of event data. To address this, we propose a map-free design for event-based visual-inertial state estimation in this article. Instead of estimating camera position, we find that recovering the instantaneous linear velocity aligns better with event cameras’ differential working principle. The proposed system uses raw data from a stereo event camera and an inertial measurement unit (IMU) as input, and adopts a dual-end architecture. 
The front-end preprocesses raw events and executes the computation of normal flow and depth information. To handle the temporally nonequispaced event data and establish association with temporally nonaligned IMU’s measurements, the back-end employs a continuous-time formulation and a sliding-window scheme that can progressively estimate the linear velocity and IMU’s bias. Experiments on synthetic and real data show our method achieves low-latency, metric-scale velocity estimation. To the best of the authors’ knowledge, this is the first real-time, purely event-based visual-inertial state estimator for high-speed maneuvers, requiring only sufficient textures and imposing no additional constraints on either the environment or motion pattern.\",\"PeriodicalId\":50388,\"journal\":{\"name\":\"IEEE Transactions on Robotics\",\"volume\":\"41 \",\"pages\":\"4439-4458\"},\"PeriodicalIF\":10.5000,\"publicationDate\":\"2025-06-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Robotics\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/11059886/\",\"RegionNum\":1,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"ROBOTICS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Robotics","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11059886/","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ROBOTICS","Score":null,"Total":0}
Event-Based Visual-Inertial State Estimation for High-Speed Maneuvers
Neuromorphic event-based cameras are bioinspired visual sensors with asynchronous pixels and extremely high temporal resolution. Such favorable properties make them an excellent choice for state estimation tasks under high-speed maneuvers. However, camera pose tracking frequently fails in state-of-the-art event-based visual odometry systems when the local map cannot be updated in a timely manner or when feature matching is unreliable. One of the biggest roadblocks in this field is the absence of efficient and robust methods for data association that impose no assumptions on the environment. This problem, however, is unlikely to be solved as it is in standard vision because of the motion-dependent nature of event data. To address this, we propose a map-free design for event-based visual-inertial state estimation in this article. Instead of estimating camera position, we find that recovering the instantaneous linear velocity aligns better with event cameras' differential working principle. The proposed system takes raw data from a stereo event camera and an inertial measurement unit (IMU) as input, and adopts a front-end/back-end architecture. The front-end preprocesses raw events and computes normal flow and depth information. To handle the temporally nonequispaced event data and to establish association with temporally nonaligned IMU measurements, the back-end employs a continuous-time formulation and a sliding-window scheme that progressively estimates the linear velocity and the IMU bias. Experiments on synthetic and real data show that our method achieves low-latency, metric-scale velocity estimation. To the best of the authors' knowledge, this is the first real-time, purely event-based visual-inertial state estimator for high-speed maneuvers, requiring only sufficient texture and imposing no additional constraints on either the environment or the motion pattern.
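The abstract mentions that the front-end computes normal flow from raw events. As background, a common way to estimate normal flow from an event stream (not necessarily the authors' exact method) is to fit a local plane t = a·x + b·y + c to the spatiotemporal coordinates of neighboring events and read the flow off the gradient of t. The sketch below illustrates this standard plane-fitting approach; the function name and interface are illustrative:

```python
import numpy as np

def normal_flow_from_events(events, eps=1e-9):
    """Estimate normal flow from a local patch of events, given as an
    (N, 3) array of (x, y, t) rows.

    Fits a plane t = a*x + b*y + c by least squares; the image-plane
    gradient of t is (a, b), and the normal flow points along it with
    speed 1 / ||(a, b)||, i.e. flow = grad(t) / ||grad(t)||^2.
    """
    xy = events[:, :2]
    t = events[:, 2]
    A = np.column_stack([xy, np.ones(len(events))])
    # Least-squares plane fit: solve A @ [a, b, c] ~= t
    (a, b, c), *_ = np.linalg.lstsq(A, t, rcond=None)
    g = np.array([a, b])
    n2 = g @ g
    if n2 < eps:            # near-flat time surface: flow unobservable
        return np.zeros(2)
    return g / n2
```

For example, events generated by an edge sweeping rightward at 10 px per time unit produce a time surface with gradient (0.1, 0), and the function recovers the flow (10, 0). In practice this fit is done robustly (e.g., with outlier rejection) on small spatiotemporal neighborhoods around each incoming event.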
Journal introduction:
The IEEE Transactions on Robotics (T-RO) is dedicated to publishing fundamental papers covering all facets of robotics, drawing on interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, and beyond. From industrial applications to service and personal assistance, and from surgical operations to space, underwater, and remote exploration, robots and intelligent machines play pivotal roles across a wide range of domains, including entertainment, safety, search and rescue, military applications, agriculture, and intelligent vehicles.
Special emphasis is placed on intelligent machines and systems designed for unstructured environments, where a significant portion of the environment remains unknown and beyond direct sensing or control.