Temporal Pyramid Alignment and Adaptive Fusion of Event Stream and Image Frame for Keypoint Detection and Tracking in Autonomous Driving
Authors: Peijun Shi, Chee-Onn Chow, Wei Ru Wong
Journal: Alexandria Engineering Journal, vol. 127, pp. 228-238 (published 2025-05-13)
DOI: 10.1016/j.aej.2025.04.098
URL: https://www.sciencedirect.com/science/article/pii/S1110016825005940
Impact factor: 6.2 · JCR: Q1 (Engineering, Multidisciplinary)
Citations: 0
Abstract
This paper addresses the alignment and fusion challenges that arise when combining event cameras with RGB cameras. For multimodal alignment, we adopt a Temporal Pyramid Alignment mechanism that synchronizes event streams and RGB frames across multiple temporal scales. For multimodal fusion, we design an adaptive fusion module that dynamically adjusts the contribution of each modality according to scene complexity and feature quality; a gating network computes the fusion weights from both the relative importance of each modality and its noise characteristics. A Cross-Modal Feature Compensation module is integrated into the framework to enhance information utilization. The framework also incorporates a Dynamic Inference Path Selection mechanism, guided by input complexity, to optimize the allocation of computational resources, together with a dynamic noise suppression mechanism that improves the robustness of feature extraction. Experiments on the DSEC dataset show that the proposed method achieves 36.9% mAP and a 40.1% tracking success rate (SR), surpassing existing approaches by 1.8% mAP and 1.6% SR while maintaining real-time efficiency at 13.1 FPS, and that it is particularly effective in extreme lighting and fast-motion scenarios. The approach is relevant to autonomous driving, robotics, and augmented reality, where robust multimodal perception under dynamic conditions is critical.
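The gated adaptive fusion described in the abstract can be illustrated with a toy sketch. The abstract does not give the gate's architecture, so everything below is an assumption made for illustration: the function names (`gate_weights`, `fuse`), the `noise_penalty` parameter, and the hand-written logit formula, which stands in for what the paper describes as a learned gating network. The sketch only shows the general idea of turning per-modality importance and noise estimates into normalized weights and blending two feature vectors with them.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a short list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gate_weights(importance: List[float], noise: List[float],
                 noise_penalty: float = 1.0) -> List[float]:
    """Hypothetical gate: each modality's logit is its importance score
    minus a penalty proportional to its estimated noise level.
    A real gating network would learn this mapping from data."""
    logits = [imp - noise_penalty * n for imp, n in zip(importance, noise)]
    return softmax(logits)


def fuse(event_feat: List[float], rgb_feat: List[float],
         weights: List[float]) -> List[float]:
    """Convex combination of two equally sized feature vectors."""
    if len(event_feat) != len(rgb_feat):
        raise ValueError("feature vectors must have the same length")
    w_event, w_rgb = weights
    return [w_event * e + w_rgb * r for e, r in zip(event_feat, rgb_feat)]


# Toy scenario (made-up numbers): both modalities are equally important,
# but the RGB features are noisier, e.g. under extreme lighting.
event_feat = [0.9, 0.1, 0.4]
rgb_feat = [0.2, 0.3, 0.5]
weights = gate_weights(importance=[1.0, 1.0], noise=[0.1, 0.8])
fused = fuse(event_feat, rgb_feat, weights)
```

Because the weights come from a softmax, they are positive and sum to one, so each fused value lies between the corresponding event and RGB values; in the toy scenario the noisier RGB stream receives the smaller weight.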
About the journal:
Alexandria Engineering Journal is an international journal devoted to publishing high-quality papers in the field of engineering and applied science. It is cited in the Engineering Information Services (EIS) and the Chemical Abstracts (CA). Papers published in the journal are grouped into five sections, according to the following classification:
• Mechanical, Production, Marine and Textile Engineering
• Electrical Engineering, Computer Science and Nuclear Engineering
• Civil and Architecture Engineering
• Chemical Engineering and Applied Sciences
• Environmental Engineering