YOLO-MFDNet: An object detection algorithm for multi-scale remote sensing images
Renhao Jiao, Rui Fan, Weigui Nan, Ming Lu, Xiaojia Yang, Zhiqiang Zhao, Jin Dang, Yanshan Tian, Baiying Dong, Xiaowei He, Xiaoli Luo
Journal: Digital Signal Processing, volume 168, Article 105479 (Journal Article)
DOI: 10.1016/j.dsp.2025.105479
Publication date: 2025-07-11
Impact factor: 2.9 (JCR Q2, Engineering, Electrical & Electronic)
URL: https://www.sciencedirect.com/science/article/pii/S1051200425005019
Citations: 0
Abstract
Despite considerable progress in object detection for remote sensing images, complex backgrounds and large variations in object scale remain prominent problems. To address them, this paper proposes a new detection network, YOLO-MFDNet, which aims to strengthen multi-scale object perception and improve detection accuracy. The network introduces three key components: a multi-scale spatial attention (MSSA) mechanism, a flexible scaling downsampling (FSDown) mechanism, and a distance extended IoU (DXIOU) loss function. MSSA combines multi-scale feature fusion with one-dimensional encoding along the two spatial dimensions to enhance the spatial representation of targets and integrate multi-scale information efficiently. FSDown combines depthwise separable convolution, dilated convolution, and residual connections to enlarge the receptive field while preserving sensitivity to fine details, balancing detection accuracy against computational cost. The DXIOU loss function models scale differences to reduce the risk of false and missed detections. The effectiveness of YOLO-MFDNet is verified on three public remote sensing datasets: relative to the baseline model, its mAP50 is 2.7% higher on DOTA v2.0, 1.1% higher on DIOR, and 6% higher on RSOD, surpassing multiple existing models. With little change in the number of parameters, YOLO-MFDNet achieves higher detection accuracy on multiple datasets, indicating that it improves detection performance while maintaining computational efficiency. The source code will be available at https://github.com/stevenjiaojiao/YOLO-MFDNet.
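The abstract does not give the DXIOU formula, so it is not reproduced here. As background only, the sketch below shows the two standard quantities the abstract builds on: plain IoU, which the mAP50 metric thresholds at 0.5, and the widely used Distance-IoU (DIoU) loss, which adds a center-distance penalty to IoU. The function names, the corner-format box convention, and the `>=` comparison at the threshold are assumptions made for illustration; none of this code comes from the paper or its repository.

```python
from typing import Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def diou_loss(pred: Box, gt: Box) -> float:
    """Standard Distance-IoU loss: 1 - IoU + d^2 / c^2.

    d is the distance between the two box centers and c is the diagonal
    of the smallest box enclosing both. This is NOT the paper's DXIOU,
    which additionally models scale differences in a way the abstract
    does not specify.
    """
    cx_p, cy_p = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    cx_g, cy_g = (gt[0] + gt[2]) / 2, (gt[1] + gt[3]) / 2
    center_dist_sq = (cx_p - cx_g) ** 2 + (cy_p - cy_g) ** 2

    # Smallest axis-aligned box that encloses both boxes.
    ex1, ey1 = min(pred[0], gt[0]), min(pred[1], gt[1])
    ex2, ey2 = max(pred[2], gt[2]), max(pred[3], gt[3])
    diag_sq = (ex2 - ex1) ** 2 + (ey2 - ey1) ** 2

    penalty = center_dist_sq / diag_sq if diag_sq > 0 else 0.0
    return 1.0 - iou(pred, gt) + penalty


def matches_at_50(pred: Box, gt: Box) -> bool:
    """True if the prediction would count as a match under the 0.5 IoU
    threshold used by mAP50 (class and confidence handling omitted)."""
    return iou(pred, gt) >= 0.5


if __name__ == "__main__":
    pred, gt = (0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)
    print(iou(pred, gt), diou_loss(pred, gt), matches_at_50(pred, gt))
```

Unlike plain IoU, the DIoU penalty still yields a useful training signal when the boxes do not overlap at all, which is one reason distance-based IoU variants are common in detection losses.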
About the journal:
Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing yet it aims to be the most innovative. The Journal invites top quality research articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal.
The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing including seismic signal processing
• cheminformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy