Title: GFENet: group-wise feature-enhanced network for steering angle prediction by fusing events and images
Authors: Duo-Wen Chen, Chi Guo, Jian-Lang Hu
DOI: 10.1007/s10489-024-06019-3
Journal: Applied Intelligence, vol. 55, no. 3 (Q2, Computer Science, Artificial Intelligence; IF 3.5)
Publication date: 2024-12-21 (Journal Article)
PDF: https://link.springer.com/content/pdf/10.1007/s10489-024-06019-3.pdf
GFENet: group-wise feature-enhanced network for steering angle prediction by fusing events and images
Existing end-to-end networks for steering angle prediction usually take images from standard cameras as input. However, standard cameras are susceptible to poor lighting conditions and motion blur, which hinders training an accurate and robust end-to-end network. In contrast, biologically inspired event cameras overcome these shortcomings through their unique working principle and offer significant advantages such as high temporal resolution, high dynamic range, and low power consumption. Nevertheless, event cameras generate considerable noise and cannot provide texture information in static regions. The two camera types are therefore complementary to some extent. To explore the benefits of fusing information from both cameras in autonomous driving tasks, we propose GFENet, an attention-based two-stream encoder-decoder architecture that predicts the steering angle by combining events and images. First, asynchronous, sparse events are converted into synchronous, dense event frames. Then, event frames and the corresponding image frames are fed into two symmetric encoders to extract features. Next, we introduce a Group-wise Feature-Enhanced (GFE) module that refines features and suppresses noise to guide the fusion of the two modalities' features at different levels. Finally, the fused features are passed through a simple decoder to predict the steering angle. Experimental results on the DDD20 and EventScape datasets show that GFENet outperforms state-of-the-art image-event fusion methods.
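The abstract does not specify how asynchronous, sparse events are converted into synchronous, dense event frames. As an illustrative sketch only (the function name and the polarity-histogram representation are assumptions, not the paper's documented method), a common choice is to accumulate polarity-separated event counts per pixel over a time window:

```python
import numpy as np

def events_to_frame(events, height, width):
    """Accumulate asynchronous events into a dense 2-channel event frame.

    events: (N, 4) array with columns (timestamp, x, y, polarity),
            where polarity is +1 or -1.
    Returns an array of shape (2, height, width): channel 0 counts
    positive events per pixel, channel 1 counts negative events.
    """
    frame = np.zeros((2, height, width), dtype=np.float32)
    if len(events) == 0:
        return frame
    x = events[:, 1].astype(int)
    y = events[:, 2].astype(int)
    pos = events[:, 3] > 0
    # np.add.at handles repeated (y, x) indices correctly,
    # so multiple events at the same pixel all get counted.
    np.add.at(frame[0], (y[pos], x[pos]), 1.0)
    np.add.at(frame[1], (y[~pos], x[~pos]), 1.0)
    return frame
```

A frame like this can then be stacked or normalized and fed to a standard CNN encoder alongside the corresponding image frame, matching the two-stream setup the abstract describes.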
Journal overview:
With a focus on research in artificial intelligence and neural networks, this journal addresses real-life manufacturing, defense, management, government, and industrial problems that are too complex to be solved through conventional approaches and instead require the simulation of intelligent thought processes, heuristics, applications of knowledge, and distributed and parallel processing. The integration of these multiple approaches in solving complex problems is of particular importance.
The journal presents new and original research and technological developments, addressing real and complex issues applicable to difficult problems. It provides a medium for exchanging scientific research and technological achievements accomplished by the international community.