YOLO-MFDNet: An object detection algorithm for multi-scale remote sensing images

IF 2.9 | Zone 3 (Engineering Technology) | JCR Q2, ENGINEERING, ELECTRICAL & ELECTRONIC
Renhao Jiao, Rui Fan, Weigui Nan, Ming Lu, Xiaojia Yang, Zhiqiang Zhao, Jin Dang, Yanshan Tian, Baiying Dong, Xiaowei He, Xiaoli Luo
DOI: 10.1016/j.dsp.2025.105479
Journal: Digital Signal Processing, volume 168, Article 105479
Publication date: 2025-07-11
Publication type: Journal Article
Open access: No
URL: https://www.sciencedirect.com/science/article/pii/S1051200425005019
Citations: 0

Abstract

Despite considerable progress in object detection for remote sensing images, complex backgrounds and large variations in object scale remain prominent problems. To address them, this paper proposes a new detection network, YOLO-MFDNet, which aims to strengthen multi-scale object perception and improve detection accuracy. The network introduces three key innovations: a multi-scale spatial attention (MSSA) mechanism, a flexible scaling downsampling (FSDown) mechanism, and a distance extended IOU (DXIOU) loss function. MSSA combines multi-scale feature fusion with one-dimensional encoding along the two spatial dimensions to enhance the spatial representation of targets and integrate multi-scale information efficiently. FSDown combines depthwise separable convolution, dilated convolution, and residual connections to enlarge the receptive field while remaining sensitive to fine details, balancing detection accuracy against computational efficiency. The DXIOU loss function reduces the risk of false and missed detections by modeling scale differences. The effectiveness of YOLO-MFDNet is verified on three public remote sensing datasets: relative to the baseline model, mAP50 improves by 2.7% on DOTA v2.0, by 1.1% on DIOR, and by 6% on RSOD, surpassing multiple existing models. With little change in the number of parameters, YOLO-MFDNet achieves higher detection accuracy across these datasets, indicating that it improves detection performance while preserving computational efficiency. The source code will be available at https://github.com/stevenjiaojiao/YOLO-MFDNet.
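The abstract does not state the DXIOU formula, so it is not reproduced here. As background only, the following minimal sketch implements plain IoU and the standard Distance-IoU (DIoU) loss, a well-known distance-aware IoU loss of the general kind that the name DXIOU suggests; treating DIoU as a relative of DXIOU is an assumption, and the scale-difference term described in the abstract is not modeled. The function names `iou` and `diou_loss` and the `(x1, y1, x2, y2)` box layout are illustrative choices, not taken from the paper's code.

```python
def iou(a, b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def diou_loss(pred, target):
    """Standard DIoU loss: 1 - IoU + d^2 / c^2.

    d is the distance between the box centers and c is the diagonal of the
    smallest box enclosing both. This is NOT the paper's DXIOU loss.
    """
    # Squared distance between the two box centers.
    pcx, pcy = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    tcx, tcy = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2
    center_dist_sq = (pcx - tcx) ** 2 + (pcy - tcy) ** 2

    # Squared diagonal of the smallest enclosing box.
    ex1, ey1 = min(pred[0], target[0]), min(pred[1], target[1])
    ex2, ey2 = max(pred[2], target[2]), max(pred[3], target[3])
    diag_sq = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2

    penalty = center_dist_sq / diag_sq if diag_sq > 0 else 0.0
    return 1.0 - iou(pred, target) + penalty


if __name__ == "__main__":
    box_a = (0.0, 0.0, 2.0, 2.0)
    box_b = (1.0, 1.0, 3.0, 3.0)
    print(iou(box_a, box_b))        # overlap 1, union 7
    print(diou_loss(box_a, box_b))  # IoU term plus center-distance penalty
```

Plain IoU is also what the mAP50 metric quoted above relies on: a predicted box counts as a match when its IoU with a ground-truth box of the same class is at least 0.5. Unlike plain IoU, the center-distance penalty still provides a signal when the boxes do not overlap at all, which is the usual motivation for distance-aware variants.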
Source journal
Digital Signal Processing (Engineering Technology: Engineering, Electrical & Electronic)
CiteScore: 5.30
Self-citation rate: 17.20%
Articles published: 435
Review time: 66 days
Journal description: Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The Journal invites top quality research articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal. The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing including seismic signal processing
• chemioinformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy