Beyond conventional vision: RGB-event fusion for robust object detection in dynamic traffic scenarios

Impact Factor: 14.5 · Q1 (Transportation)
Zhanwen Liu, Yujing Sun, Yang Wang, Nan Yang, Shengbo Eben Li, Xiangmo Zhao
Journal: Communications in Transportation Research, Volume 5, Article 100202
DOI: 10.1016/j.commtr.2025.100202
Published: 2025-08-18
URL: https://www.sciencedirect.com/science/article/pii/S2772424725000423
Citations: 0

Abstract

Conventional RGB cameras have an intrinsic dynamic range limitation that reduces global contrast and causes the loss of high-frequency details such as textures and edges in complex, dynamic traffic environments (e.g., nighttime driving or tunnel scenes). This deficiency hinders the extraction of discriminative features and degrades the performance of frame-based traffic object detection. To address this problem, we introduce a bio-inspired event camera integrated with an RGB camera to complement high dynamic range information, and propose a motion cue fusion network (MCFNet), an innovative fusion network that optimally achieves spatiotemporal alignment and develops an adaptive strategy for cross-modal feature fusion, to overcome performance degradation under challenging lighting conditions. Specifically, we design an event correction module (ECM) that temporally aligns asynchronous event streams with their corresponding image frames through optical-flow-based warping. The ECM is jointly optimized with the downstream object detection network to learn task-aware event representations. Subsequently, the event dynamic upsampling module (EDUM) enhances the spatial resolution of event frames to align their distribution with the structures of image pixels, achieving precise spatiotemporal alignment. Finally, the cross-modal mamba fusion module (CMM) employs adaptive feature fusion through a novel cross-modal interlaced scanning mechanism, effectively integrating complementary information for robust detection performance. Experiments conducted on the DSEC-Det and PKU-DAVIS-SOD datasets demonstrate that MCFNet significantly outperforms existing methods in various poor-lighting and fast-moving traffic scenarios. Notably, on the DSEC-Det dataset, MCFNet achieves a remarkable improvement, surpassing the best existing methods by 7.4% in mAP50 and 1.7% in mAP.
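The ECM described above is learned jointly with the detector; only its core geometric idea, displacing each asynchronous event along an optical-flow field in proportion to its time gap from the target frame, can be inferred from the abstract. The sketch below illustrates that idea alone (function name, array layout, and the constant per-pixel flow assumption are illustrative, not from the paper):

```python
import numpy as np

def warp_events_to_frame(events, flow, t_frame):
    """Warp asynchronous events (x, y, t, polarity) toward the timestamp
    of an image frame using a dense optical-flow field.

    events : (N, 4) array of [x, y, t, polarity]
    flow   : (H, W, 2) per-pixel displacement per unit of time
    t_frame: timestamp of the target image frame
    """
    h, w, _ = flow.shape
    # look up the flow at each event's (integer) pixel location
    xs = events[:, 0].astype(int).clip(0, w - 1)
    ys = events[:, 1].astype(int).clip(0, h - 1)
    dt = t_frame - events[:, 2]                 # time gap to the frame
    # displace each event along the flow, scaled by its time gap
    warped = events.copy()
    warped[:, 0] = events[:, 0] + flow[ys, xs, 0] * dt
    warped[:, 1] = events[:, 1] + flow[ys, xs, 1] * dt
    return warped
```

In the paper, this warping is differentiable and optimized end-to-end with the detection loss, so the alignment is task-aware rather than driven by a fixed flow estimate as here.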
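The EDUM learns an upsampling that matches the event-frame distribution to the structure of the image pixels; the abstract does not specify its form. A fixed nearest-neighbour stand-in shows only the resolution-matching step, with the learned, content-adaptive part omitted (everything here is a placeholder, not the paper's module):

```python
import numpy as np

def upsample_event_frame(evt_frame, scale):
    """Nearest-neighbour upsampling of an event frame so its spatial
    resolution matches the RGB image grid. The paper's EDUM learns this
    mapping dynamically; this fixed kernel is only a stand-in.

    evt_frame : (H, W) or (H, W, C) accumulated event frame
    scale     : integer upsampling factor
    """
    up = np.repeat(evt_frame, scale, axis=0)    # replicate rows
    return np.repeat(up, scale, axis=1)         # replicate columns
```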
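The CMM's cross-modal interlaced scanning presumably builds a single token sequence that alternates between the two modalities before a Mamba-style state-space model mixes them. A minimal sketch of the interleaving itself (the state-space model is not included, and all names and layouts are hypothetical):

```python
import numpy as np

def interlaced_scan(rgb_feat, evt_feat):
    """Interleave RGB and event feature tokens into one sequence so a
    selective state-space (Mamba-style) model can attend to both
    modalities alternately along the scan order.

    rgb_feat, evt_feat : (H, W, C) feature maps of equal shape
    returns            : (2*H*W, C) interlaced token sequence
    """
    h, w, c = rgb_feat.shape
    rgb_tokens = rgb_feat.reshape(-1, c)        # raster-scan flatten
    evt_tokens = evt_feat.reshape(-1, c)
    seq = np.empty((2 * h * w, c), dtype=rgb_feat.dtype)
    seq[0::2] = rgb_tokens                      # even slots: RGB
    seq[1::2] = evt_tokens                      # odd slots: events
    return seq
```

Alternating the modalities along the scan gives every RGB token an adjacent event token in the sequence, which is one plausible way to let a causal sequence model fuse complementary information at fine granularity.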
Source journal metrics: CiteScore 15.20, self-citation rate 0.00%