Unified diffusion-based object detection in multi-modal and low-light remote sensing images

Impact factor 0.7 · CAS Tier 4 (Engineering & Technology) · JCR Q4, Engineering, Electrical & Electronic
Xu Sun, Yinhui Yu, Qing Cheng
Electronics Letters, vol. 60, no. 22. Published 19 November 2024. DOI: 10.1049/ell2.70093
Full text: https://onlinelibrary.wiley.com/doi/10.1049/ell2.70093
Citations: 0

Abstract

Remote sensing object detection remains a challenge under complex conditions such as low light, adverse weather, and modality attacks or losses. Previous approaches typically alleviate this problem by enhancing visible images or leveraging multi-modal fusion technologies. In view of this, the authors propose a unified framework based on YOLO-World that combines the advantages of both schemes, achieving more adaptable and robust remote sensing object detection in complex real-world scenarios. This framework introduces a unified modality modelling strategy, allowing the model to learn abundant object features from multiple remote sensing datasets. Additionally, a U-fusion neck based on the diffusion method is designed to effectively remove modality-specific noise and generate missing complementary features. Extensive experiments were conducted on four remote sensing image datasets: the multimodal VEDAI and DroneVehicle datasets and the unimodal VisDrone and UAVDT datasets. This approach achieves average precision scores of 50.5%, 55.3%, 25.1%, and 20.7% respectively, outperforming advanced multimodal remote sensing object detection methods and low-light image enhancement techniques.
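The abstract's core idea, recovering fused features via a conditional reverse diffusion process even when a modality is noisy or missing, can be illustrated with a toy sketch. The paper itself is not open here, so everything below is an assumption: the schedule, the conditioning scheme, and the `toy_denoiser` (a stand-in for the learned U-fusion network, which here simply predicts the mean of whatever modality features are available) are illustrative, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

T = 50                                  # number of diffusion steps (assumed)
betas = np.linspace(1e-4, 0.02, T)      # standard DDPM-style noise schedule
alphas = 1.0 - betas
alpha_bar = np.cumprod(alphas)

def forward_noise(x0, t):
    """Sample q(x_t | x_0): add Gaussian noise to clean features x0 at step t."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

def toy_denoiser(x_t, t, cond):
    """Stand-in for the learned fusion network: predict the clean fused
    features as the mean of the available conditioning modality features."""
    return np.mean(cond, axis=0)

def reverse_fuse(shape, cond, steps=T):
    """Run the reverse process from pure noise, conditioned on whichever
    modality features are available, to produce fused features."""
    x = rng.standard_normal(shape)
    for t in reversed(range(steps)):
        x0_hat = toy_denoiser(x, t, cond)   # predicted clean fused features
        if t > 0:
            x = forward_noise(x0_hat, t - 1)  # re-noise to step t-1
        else:
            x = x0_hat                        # final step: return the estimate
    return x

# Two modality feature maps (e.g. visible + infrared), 4 channels of 8x8.
rgb = rng.standard_normal((4, 8, 8))
ir = rng.standard_normal((4, 8, 8))

fused_both = reverse_fuse(rgb.shape, cond=[rgb, ir])  # both modalities present
fused_ir_only = reverse_fuse(rgb.shape, cond=[ir])    # visible modality missing
print(fused_both.shape, fused_ir_only.shape)
```

With a trivial mean-predicting denoiser the reverse process degenerates to averaging the available modalities; the point of the sketch is only the control flow, in which the same conditional sampling loop handles both the full multi-modal case and the missing-modality case.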

Source journal

Electronics Letters (Engineering: Electrical & Electronic)
CiteScore: 2.70
Self-citation rate: 0.00%
Articles per year: 268
Review time: 3.6 months

About the journal: Electronics Letters is an internationally renowned peer-reviewed rapid-communication journal that publishes short original research papers every two weeks. Its broad and interdisciplinary scope covers the latest developments in all electronic engineering related fields including communication, biomedical, optical and device technologies. Electronics Letters also provides further insight into some of the latest developments through special features and interviews. As a journal at the forefront of its field, Electronics Letters publishes papers covering all themes of electronic and electrical engineering, including:

- Antennas and Propagation
- Biomedical and Bioinspired Technologies, Signal Processing and Applications
- Control Engineering
- Electromagnetism: Theory, Materials and Devices
- Electronic Circuits and Systems
- Image, Video and Vision Processing and Applications
- Information, Computing and Communications
- Instrumentation and Measurement
- Microwave Technology
- Optical Communications
- Photonics and Opto-Electronics
- Power Electronics, Energy and Sustainability
- Radar, Sonar and Navigation
- Semiconductor Technology
- Signal Processing
- MIMO