DR-YOLO: An improved multi-scale small object detection model for drone aerial photography scenes based on YOLOv7
Authors: Hongbo Bi, Rui Dai, Fengyang Han, Cong Zhang
Journal: Digital Signal Processing, volume 164, Article 105265 (published 2025-04-28)
Impact factor: 2.9; JCR: Q2 (Engineering, Electrical & Electronic)
DOI: 10.1016/j.dsp.2025.105265
URL: https://www.sciencedirect.com/science/article/pii/S1051200425002878
With the advancement of drone technology, detecting and recognizing ground targets from aerial perspectives has become crucial in various drone applications. However, object detection in drone imagery poses several challenges, including the prevalence of small targets, the significant impact of aerial perspectives, variations in target scales, complex backgrounds, and frequent occlusions. To address these issues, we propose DR-YOLO, a multi-scale target detection model specifically designed for aerial drone images, building upon the YOLOv7 framework. We introduce the Spatial Pyramid Pooling with Dilated Convolutions (SPPDSPC) module to enhance dense target feature extraction. Additionally, we incorporate a decoupled detection head tailored for small objects and redesign the number and sizes of detection heads. To handle complex backgrounds and varying target sizes, we embed the Multi-Scale Feature Fusion (HTLF) Module into the feature pyramid network, providing rich spatial information for detection heads of different scales. Furthermore, we utilize the Gaussian Wasserstein Distance (GWD) to refine the regression loss, leading to improved bounding box quality, faster convergence, and higher accuracy in small object detection. Experimental results on the VisDrone2019 dataset demonstrate a 14.8% increase in mAP@0.5 and a 9.8% increase in mAP@0.5:0.95 compared to the baseline YOLOv7, validating the effectiveness of DR-YOLO in detecting objects within aerial drone imagery. The code and results of our method are available at https://github.com/DRdairuiDR/DR-YOLO.
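The abstract names the Gaussian Wasserstein Distance but does not give its formula, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes axis-aligned boxes given as (cx, cy, w, h), each modelled as a 2D Gaussian with mean (cx, cy) and covariance diag(w²/4, h²/4), and uses the standard closed form for the squared 2-Wasserstein distance between such Gaussians. The function names and the `scale` constant are hypothetical, and the exponential normalisation is one common choice rather than the one confirmed for DR-YOLO.

```python
import math


def gwd_squared(box_a, box_b):
    """Squared 2-Wasserstein distance between two boxes modelled as Gaussians.

    Each box is (cx, cy, w, h) and is treated as N(mu, Sigma) with
    mu = (cx, cy) and Sigma = diag(w^2 / 4, h^2 / 4) (an assumed modelling choice).
    """
    cx_a, cy_a, w_a, h_a = box_a
    cx_b, cy_b, w_b, h_b = box_b
    # Squared Euclidean distance between the two means (box centres).
    center_term = (cx_a - cx_b) ** 2 + (cy_a - cy_b) ** 2
    # For diagonal covariances, ||Sigma_a^(1/2) - Sigma_b^(1/2)||_F^2
    # reduces to the squared differences of the half-widths and half-heights.
    shape_term = ((w_a - w_b) ** 2 + (h_a - h_b) ** 2) / 4.0
    return center_term + shape_term


def gwd_loss(box_a, box_b, scale=12.8):
    """Illustrative regression loss in [0, 1): 1 - exp(-sqrt(W2^2) / scale).

    `scale` is a hypothetical normalisation constant, not a value from the paper.
    """
    return 1.0 - math.exp(-math.sqrt(gwd_squared(box_a, box_b)) / scale)


# Two 4x4 boxes whose centres are 6 pixels apart do not overlap, so their IoU
# is zero, yet the distance below is finite and shrinks as the boxes approach.
pred = (10.0, 10.0, 4.0, 4.0)
target = (16.0, 10.0, 4.0, 4.0)
print(gwd_squared(pred, target))
print(gwd_loss(pred, target))
```

Unlike an IoU-based loss, this quantity keeps changing smoothly when small boxes do not overlap at all, which is one plausible reason a distance of this kind helps with small-object regression.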
About the journal:
Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The Journal invites top-quality research articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal.
The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing including seismic signal processing
• chemioinformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy