MSConv-YOLO: An Improved Small Target Detection Algorithm Based on YOLOv8.

Impact factor: 2.7 · JCR Q3 (Imaging Science & Photographic Technology)

Journal of Imaging · Pub Date: 2025-08-21 · DOI: 10.3390/jimaging11080285

Linli Yang, Barmak Honarvar Shakibaei Asli

Journal of Imaging, volume 11, issue 8. Publication type: Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12387663/pdf/


Abstract

Small object detection in UAV aerial imagery presents significant challenges due to scale variations, sparse feature representation, and complex backgrounds. To address these issues, this paper focuses on practical engineering improvements to the existing YOLOv8s framework, rather than proposing a fundamentally new algorithm. We introduce MultiScaleConv-YOLO (MSConv-YOLO), an enhanced model that integrates well-established techniques to improve detection performance for small targets. Specifically, the proposed approach introduces three key improvements: (1) a MultiScaleConv (MSConv) module that combines depthwise separable and dilated convolutions with varying dilation rates, enhancing multi-scale feature extraction while maintaining efficiency; (2) the replacement of CIoU with WIoU v3 as the bounding box regression loss, which incorporates a dynamic non-monotonic focusing mechanism to improve localization for small targets; and (3) the addition of a high-resolution detection head in the neck-head structure, leveraging FPN and PAN to preserve fine-grained features and ensure full-scale coverage. Experimental results on the VisDrone2019 dataset show that MSConv-YOLO outperforms the baseline YOLOv8s by achieving a 6.9% improvement in mAP@0.5 and a 6.3% gain in recall. Ablation studies further validate the complementary impact of each enhancement. This paper presents practical and effective engineering enhancements to small object detection in UAV scenarios, offering an improved solution without introducing entirely new theoretical constructs. Future work will focus on lightweight deployment and adaptation to more complex environments.
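The efficiency argument behind the MSConv module can be illustrated with simple arithmetic. The sketch below is not the authors' implementation: the channel counts, the 3×3 kernel size, and the dilation rates (1, 2, 3) are hypothetical values chosen for illustration, since the abstract does not state them. It compares the weight count of a standard convolution with a depthwise separable one, and shows how dilation widens the area a kernel covers without adding weights.

```python
def conv_params(c_in, c_out, k):
    """Weights in a standard k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k (one filter per input channel) plus a 1 x 1 pointwise mix."""
    return c_in * k * k + c_in * c_out


def effective_kernel(k, dilation):
    """Side length of the region spanned by a k x k kernel at a given dilation."""
    return k + (k - 1) * (dilation - 1)


if __name__ == "__main__":
    # Hypothetical layer sizes; the abstract does not give the real ones.
    c_in, c_out, k = 64, 64, 3

    standard = conv_params(c_in, c_out, k)
    separable = depthwise_separable_params(c_in, c_out, k)
    print(f"standard conv weights:            {standard}")
    print(f"depthwise separable conv weights: {separable}")
    print(f"reduction factor:                 {standard / separable:.2f}x")

    # Assumed dilation rates for the parallel branches of a multi-scale block.
    for d in (1, 2, 3):
        side = effective_kernel(k, d)
        print(f"dilation {d}: 3x3 kernel spans {side}x{side} pixels")
```

Under these assumed sizes, each branch keeps the same nine weights per channel while covering a progressively larger neighbourhood, which is one plausible reading of "enhancing multi-scale feature extraction while maintaining efficiency."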


