Cross-Modality Target Detection Using Infrared and Visible Image Fusion for Robust Objection recognition

IF 4.9 | Zone 3 (Computer Science) | Q1, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Computers & Electrical Engineering | Pub Date: 2025-02-11 | DOI: 10.1016/j.compeleceng.2025.110133

Hang Yu, Jichen Gao, Suiping Zhou, Chenyang Li, Jiaqi Shi, Feng Guo

Citations: 0

Cross-Modality Target Detection Using Infrared and Visible Image Fusion for Robust Objection recognition

Visible-infrared cross-modal object detection aims to overcome the limitations of single-modality imaging in complex environments (rain, fog, low light) by using dual-modal images. Most existing methods use finite-size convolution kernels to learn local features and ignore non-local feature dependencies between the infrared and visible modalities, resulting in unsatisfactory detection performance. To address this problem, we propose a multi-modal object detection algorithm that fuses the visible and infrared modalities through cross enhancement and long-range guidance, effectively combining complementary information and shared collaborative information to improve detection. First, we propose a cross-modality feature enhancement method that exploits the differences between the channel information and the spatial information of each modality. Second, we use Transformer-based cross-attention layers to exchange information between modalities over long ranges, and add self-attention layers to strengthen internal connections. Finally, we propose a feature enhancement module that improves performance through a multi-branch structure composed of different convolutions. Experiments on three publicly available datasets show that the proposed approach achieves superior robustness and accuracy across all weather conditions and under constantly changing lighting.
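
The cross-attention step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes each modality's feature map has already been flattened into a list of token vectors, it omits the learned query/key/value projections and multi-head structure that a Transformer layer would normally have, and the names `cross_attention` and `fuse` are hypothetical. Each visible token attends over all infrared tokens and vice versa, so every position can draw on the whole of the other modality instead of a local convolution window.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention; every query attends over all keys."""
    dim = len(queries[0])
    attended = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted average of the value vectors, one channel at a time.
        attended.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return attended


def fuse(visible, infrared):
    """Hypothetical fusion step: each modality queries the other, then adds a residual."""
    vis_ctx = cross_attention(visible, infrared, infrared)
    ir_ctx = cross_attention(infrared, visible, visible)
    vis_out = [[x + c for x, c in zip(tok, ctx)] for tok, ctx in zip(visible, vis_ctx)]
    ir_out = [[x + c for x, c in zip(tok, ctx)] for tok, ctx in zip(infrared, ir_ctx)]
    return vis_out, ir_out


# Toy tokens: 3 visible and 2 infrared positions, 2 channels each (made-up values).
visible_tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
infrared_tokens = [[0.2, 0.8], [0.9, 0.1]]
vis_fused, ir_fused = fuse(visible_tokens, infrared_tokens)
```

Because the attention weights sum to one, each context vector is a convex combination of the other modality's tokens; the residual add keeps the original features, so the exchange supplements them without replacing them.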

Source journal

Computers & Electrical Engineering (Engineering Technology, Engineering: Electrical & Electronic)

CiteScore

9.20

Self-citation rate

7.00%

Articles published

661

Review time

47 days

About the journal: The impact of computers has nowhere been more revolutionary than in electrical engineering. The design, analysis, and operation of electrical and electronic systems are now dominated by computers, a transformation that has been motivated by the natural ease of interface between computers and electrical systems, and the promise of spectacular improvements in speed and efficiency. Published since 1973, Computers & Electrical Engineering provides rapid publication of topical research into the integration of computer technology and computational techniques with electrical and electronic systems. The journal publishes papers featuring novel implementations of computers and computational techniques in areas like signal and image processing, high-performance computing, parallel processing, and communications. Special attention will be paid to papers describing innovative architectures, algorithms, and software tools.