EEF: Energy score-guided feature enhancement fusion method for RGB and thermal infrared images object detection
Authors: Tianhao Hao, Jinfu Yang, Shaochen Zhang, Shuwen Wu
Journal: Signal Processing, vol. 239, Article 110231 (impact factor 3.6; JCR Q2, Engineering, Electrical & Electronic)
Published: 2025-08-09
DOI: 10.1016/j.sigpro.2025.110231
URL: https://www.sciencedirect.com/science/article/pii/S0165168425003457
Fully exploiting the complementarity between modalities is crucial for RGB and thermal infrared (RGB-T) object detection. However, most existing methods extract features with a traditional backbone and often struggle to enhance the discriminability of features from different modalities, which restricts the representational capacity of the fused features. We propose an energy score-guided feature enhancement fusion method (EEF) for RGB-T object detection. First, we design an energy-based feature enhancement module (EFEM) that uses the proposed channel energy score to assess the importance and reliability of feature channels, enhancing the discriminability of features and focusing them on object regions. Then, we introduce an Efficient Cross-modal Fusion Module (ECFM) that captures complementary information between modalities through the global feature interaction capability of attention mechanisms. Finally, we incorporate an adaptive feedback module (AFM), which uses the fused features as guidance to obtain learning weights for each modality and thereby enhance the representational capacity of the original features. We evaluate our approach on the LLVIP and FLIR datasets, achieving 64.9% and 41.1% mAP, respectively. These results demonstrate the effectiveness of EEF in RGB-T object detection tasks.
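The three stages named in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the formulas, so everything below is an assumption made for illustration: the channel score is taken to be a log-sum-exp over spatial positions, the fusion step is plain scaled dot-product cross-attention, and the feedback step is a parameter-free sigmoid gate. The function names (`channel_energy_score`, `enhance`, `cross_modal_fuse`, `adaptive_feedback`) are hypothetical, and the published EFEM, ECFM, and AFM modules presumably contain learned layers that are omitted here.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def channel_energy_score(feat):
    """Assumed score: log-sum-exp over the spatial positions of each channel.

    feat has shape (C, H, W); the result has shape (C,).
    This is a stand-in, not the paper's definition.
    """
    flat = feat.reshape(feat.shape[0], -1)
    m = flat.max(axis=1, keepdims=True)
    return (m + np.log(np.exp(flat - m).sum(axis=1, keepdims=True))).ravel()


def enhance(feat):
    """EFEM-like step (assumed form): reweight channels by their normalized score."""
    weights = softmax(channel_energy_score(feat))      # (C,), sums to 1
    scale = weights * feat.shape[0]                    # mean scale is 1
    return feat * scale[:, None, None]


def cross_modal_fuse(rgb, thermal):
    """ECFM-like step (assumed form): each modality attends to the other."""
    c, h, w = rgb.shape
    rgb_tok = rgb.reshape(c, -1).T                     # (H*W, C) tokens
    th_tok = thermal.reshape(c, -1).T

    def attend(query, key_value):
        attn = softmax(query @ key_value.T / np.sqrt(c), axis=-1)
        return attn @ key_value                        # (H*W, C)

    fused = rgb_tok + attend(rgb_tok, th_tok) + th_tok + attend(th_tok, rgb_tok)
    return fused.T.reshape(c, h, w)


def adaptive_feedback(fused, rgb, thermal):
    """AFM-like step (assumed form): fused features gate the original modalities."""
    pooled = fused.mean(axis=(1, 2))                   # global average pool, (C,)
    gate = 1.0 / (1.0 + np.exp(-pooled))               # sigmoid, values in (0, 1)
    return rgb * gate[:, None, None], thermal * (1.0 - gate)[:, None, None]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rgb = rng.standard_normal((8, 16, 16))             # toy RGB feature map
    thermal = rng.standard_normal((8, 16, 16))         # toy thermal feature map
    fused = cross_modal_fuse(enhance(rgb), enhance(thermal))
    rgb_fb, thermal_fb = adaptive_feedback(fused, rgb, thermal)
    print(fused.shape, rgb_fb.shape, thermal_fb.shape)
```

In this sketch the two gates sum to one per channel, so the feedback step trades emphasis between modalities without changing their combined scale; that is a design choice of the illustration, not a property claimed by the paper.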
About the journal:
Signal Processing covers all aspects of the theory and practice of signal processing. It features original research work, tutorial and review articles, and accounts of practical developments. It is intended for the rapid dissemination of knowledge and experience to engineers and scientists working in the research, development, or practical application of signal processing.
Subject areas covered by the journal include: Signal Theory; Stochastic Processes; Detection and Estimation; Spectral Analysis; Filtering; Signal Processing Systems; Software Developments; Image Processing; Pattern Recognition; Optical Signal Processing; Digital Signal Processing; Multi-dimensional Signal Processing; Communication Signal Processing; Biomedical Signal Processing; Geophysical and Astrophysical Signal Processing; Earth Resources Signal Processing; Acoustic and Vibration Signal Processing; Data Processing; Remote Sensing; Signal Processing Technology; Radar Signal Processing; Sonar Signal Processing; Industrial Applications; New Applications.