RGB-T Object Detection With Failure Scenarios

IF 4.7 | Zone 2, Earth Sciences | JCR Q1, ENGINEERING, ELECTRICAL & ELECTRONIC
Qingwang Wang;Yuxuan Sun;Yongke Chi;Tao Shen
DOI: 10.1109/JSTARS.2024.3523408
Journal: IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, vol. 18, pp. 3000-3010
Published: 2024-12-27
Publication type: Journal Article
Article page: https://ieeexplore.ieee.org/document/10817087/
PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10817087
Citations: 0

Abstract

Currently, RGB-thermal (RGB-T) object detection algorithms have demonstrated excellent performance, but issues such as modality failure caused by fog, strong light, sensor damage, and other conditions can significantly impact the detector's performance. This article proposes a multimodal object detection method named diffusion enhanced object detection network (DENet), aiming to address modality failure problems caused by nonroutine environments, sensor anomalies, and other factors, while suppressing redundant information in multimodal data to improve model accuracy. First, we design a multidimensional incremental information generation module based on a diffusion model, which reconstructs the unstable information of RGB-T images through the reverse diffusion process using the original fusion feature map. To further address the issue of redundant information in existing RGB-T object detection models, a redundant information suppression module is introduced, minimizing cross-modal redundant information based on mutual information and contrastive loss. Finally, a kernel similarity-aware illumination module (KSIM) is introduced to dynamically adjust the weighting of RGB and thermal features by incorporating both illumination intensity and the similarity between modalities. KSIM can fine-tune the contribution of each modality during fusion, ensuring a more precise balance that improves recognition performance across diverse conditions. Experimental results on the DroneVehicle and VEDAI datasets show that DENet performs outstandingly in multimodal object detection tasks, effectively improving detection accuracy and reducing the impact of modality failure on performance.
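The abstract says KSIM adjusts the weighting of RGB and thermal features using illumination intensity and the similarity between modalities, but it gives no formulas. The following is only a minimal illustrative sketch of that general idea, not the authors' KSIM: the logistic illumination gate, the use of cosine similarity, and the names `cosine_similarity`, `fusion_weights`, `fuse`, and `sharpness` are all assumptions introduced here.

```python
import numpy as np


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two feature maps, compared as flat vectors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + eps))


def fusion_weights(illumination, similarity, sharpness=4.0):
    """Return (w_rgb, w_thermal), which always sum to 1.

    illumination: assumed scalar in [0, 1], 0 = dark, 1 = well lit.
    similarity:   assumed scalar in [-1, 1], agreement between modalities.
    sharpness:    hypothetical steepness of the illumination gate.
    """
    # Illumination gate: a logistic curve centred at 0.5, so brighter
    # scenes shift weight toward RGB and darker scenes toward thermal.
    gate = 1.0 / (1.0 + np.exp(-sharpness * (illumination - 0.5)))
    # When the modalities agree, pull the weights back toward an even
    # split; when they disagree, let the illumination gate decide.
    agree = float(np.clip(similarity, 0.0, 1.0))
    w_rgb = agree * 0.5 + (1.0 - agree) * gate
    return w_rgb, 1.0 - w_rgb


def fuse(rgb_feat, thermal_feat, illumination):
    """Weighted sum of two same-shaped feature maps."""
    sim = cosine_similarity(rgb_feat, thermal_feat)
    w_rgb, w_thermal = fusion_weights(illumination, sim)
    return w_rgb * rgb_feat + w_thermal * thermal_feat
```

In this sketch a bright scene with disagreeing modalities leans on RGB, a dark one leans on thermal, and strongly agreeing modalities are averaged almost evenly. A learned module would presumably predict such weights per channel or per location rather than use fixed scalars.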
Source journal
CiteScore: 9.30
Self-citation rate: 10.90%
Articles published: 563
Review time: 4.7 months
About the journal: The IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing addresses the growing field of applications in Earth observations and remote sensing, and also provides a venue for the rapidly expanding special issues that are being sponsored by the IEEE Geoscience and Remote Sensing Society. The journal draws upon the experience of the highly successful “IEEE Transactions on Geoscience and Remote Sensing” and provides a complementary medium for the wide range of topics in applied earth observations. The ‘Applications’ area encompasses the societal benefit areas of the Global Earth Observation System of Systems (GEOSS) program. Through deliberations over two years, ministers from 50 countries agreed to identify nine areas where Earth observation could positively impact the quality of life and health of their respective countries. Some of these are areas not traditionally addressed in the IEEE context. These include biodiversity, health and climate. Yet it is the skill sets of IEEE members, in areas such as observations, communications, computers, signal processing, standards and ocean engineering, that form the technical underpinnings of GEOSS. Thus, the Journal attracts a broad range of interests that serves both present members in new ways and expands the IEEE visibility into new areas.