DR-YOLO: An improved multi-scale small object detection model for drone aerial photography scenes based on YOLOv7

IF 2.9 · Zone 3, Engineering & Technology · Q2, ENGINEERING, ELECTRICAL & ELECTRONIC

Digital Signal Processing · Pub Date: 2025-04-28 · DOI: 10.1016/j.dsp.2025.105265

Hongbo Bi, Rui Dai, Fengyang Han, Cong Zhang

Volume 164, Article 105265 · https://www.sciencedirect.com/science/article/pii/S1051200425002878

Citations: 0

Abstract


DR-YOLO: An improved multi-scale small object detection model for drone aerial photography scenes based on YOLOv7

With the advancement of drone technology, detecting and recognizing ground targets from aerial perspectives has become crucial in various drone applications. However, object detection in drone imagery poses several challenges, including the prevalence of small targets, the significant impact of aerial perspectives, variations in target scales, complex backgrounds, and frequent occlusions. To address these issues, we propose DR-YOLO, a multi-scale target detection model specifically designed for aerial drone images, building upon the YOLOv7 framework. We introduce the Spatial Pyramid Pooling with Dilated Convolutions (SPPDSPC) module to enhance dense target feature extraction. Additionally, we incorporate a decoupled detection head tailored for small objects and redesign the number and sizes of detection heads. To handle complex backgrounds and varying target sizes, we embed the Multi-Scale Feature Fusion (HTLF) Module into the feature pyramid network, providing rich spatial information for detection heads of different scales. Furthermore, we utilize the Gaussian Wasserstein Distance (GWD) to refine the regression loss, leading to improved bounding box quality, faster convergence, and higher accuracy in small object detection. Experimental results on the VisDrone2019 dataset demonstrate a 14.8% increase in mAP@0.5 and a 9.8% increase in mAP@0.5:0.95 compared to the baseline YOLOv7, validating the effectiveness of DR-YOLO in detecting objects within aerial drone imagery. The code and results of our method are available at https://github.com/DRdairuiDR/DR-YOLO.
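The abstract does not give the exact form of the GWD regression loss, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes axis-aligned boxes in `(cx, cy, w, h)` form, each modeled as a 2D Gaussian with mean `(cx, cy)` and covariance `diag(w²/4, h²/4)`, for which the squared 2-Wasserstein distance has a simple closed form. The function names, the `tau` parameter, and the `log1p` normalization are illustrative assumptions.

```python
import math


def gwd_squared(box_a, box_b):
    """Squared 2-Wasserstein distance between Gaussians fitted to two boxes.

    Each box is (cx, cy, w, h) and is modeled as N((cx, cy), diag(w^2/4, h^2/4)).
    For diagonal covariances the distance reduces to a center term plus a
    shape term built from the half-width and half-height differences.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    center_term = (cxa - cxb) ** 2 + (cya - cyb) ** 2
    shape_term = ((wa - wb) ** 2 + (ha - hb) ** 2) / 4.0
    return center_term + shape_term


def gwd_loss(pred_box, target_box, tau=1.0):
    """Hypothetical bounded loss: 1 - 1 / (tau + log(1 + d^2)).

    With tau = 1 the loss is 0 for identical boxes and approaches 1 as the
    boxes move apart. The normalization is an assumption, not the paper's.
    """
    d2 = gwd_squared(pred_box, target_box)
    return 1.0 - 1.0 / (tau + math.log1p(d2))


if __name__ == "__main__":
    target = (50.0, 50.0, 8.0, 8.0)
    near = (60.0, 50.0, 8.0, 8.0)   # no overlap with target, 10 px away
    far = (90.0, 50.0, 8.0, 8.0)    # no overlap with target, 40 px away
    print(gwd_loss(near, target), gwd_loss(far, target))
```

In this example both predicted boxes have zero IoU with the target, so an IoU-based loss cannot tell them apart, whereas the distance-based loss is still smaller for the nearer box. This is one common argument for Wasserstein-style losses on small objects.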


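Source journal

Digital Signal Processing (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 5.30
Self-citation rate: 17.20%
Articles published: 435
Review time: 66 days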

About the journal: Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The Journal invites top quality research articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal. The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology,
• financial time series analysis,
• autonomous vehicles,
• quantum computing,
• neuromorphic engineering,
• human-computer interaction and intelligent user interfaces,
• environmental signal processing,
• geophysical signal processing including seismic signal processing,
• chemioinformatics and bioinformatics,
• audio, visual and performance arts,
• disaster management and prevention,
• renewable energy,