Improved YOLOv8 algorithms for small object detection in aerial imagery

IF 5.2 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of King Saud University-Computer and Information Sciences Pub Date : 2024-07-01 DOI:10.1016/j.jksuci.2024.102113

Fei Feng, Yu Hu, Weipeng Li, Feiyan Yang


Citations: 0

Abstract

In drone aerial target detection, a high proportion of small targets and complex backgrounds often cause false positives and missed detections, which lowers detection accuracy. To improve small-target detection accuracy, this study proposes two improved models based on YOLOv8s, named IMCMD_YOLOv8_small and IMCMD_YOLOv8_large, each suited to different application scenarios. First, the network structure is optimized by removing the backbone P5 layer, which is used to detect large targets, and merging the P4, P3, and P2 layers, which are better suited to medium and small targets; P3 and P2 serve as detection heads to focus more on small targets. Second, the coordinate attention mechanism is integrated into the backbone’s C2f to create a C2f_CA module that enhances the model’s focus on key information and secures a richer flow of gradient information. Third, a multiscale attention feature fusion module is designed to merge shallow and deep features. Finally, a Dynamic Head is introduced to unify the perception of scale, space, and tasks, further enhancing small-target detection. Experimental results on the VisDrone2019 dataset show that, compared with YOLOv8s, IMCMD_YOLOv8_small improves mAP@0.5 and mAP@0.5:0.95 by 7.7% and 5.1%, respectively, with a 73.0% reduction in parameter count. IMCMD_YOLOv8_large improves these metrics further, by 10.8% and 7.3%, respectively, with a 47.7% reduction in parameter count. The improved models thus raise detection accuracy while also being more lightweight, which supports the effectiveness of the improvement strategies, and they outperform other classic models.
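To make the motivation for moving the detection heads to shallower levels concrete, the following is a minimal sketch, not the authors' code. It assumes the usual YOLO-style convention that pyramid level Pk has stride 2^k, a 640-pixel input, and a 16-pixel-wide target; these values and the helper names `grid_size`, `cells_covered`, and `summarize` are illustrative assumptions rather than figures from the paper. The stock YOLOv8 detector places its heads at P3, P4, and P5, whereas the abstract names P3 and P2 as detection heads.

```python
# Illustrative sketch only. The stride convention (level Pk has stride 2**k)
# is the common YOLO-style assumption, not a detail taken from the paper.

IMAGE_SIZE = 640  # assumed input resolution in pixels
OBJECT_PX = 16    # assumed width of a small aerial target in pixels


def grid_size(image_size: int, level: int) -> int:
    """Side length of the feature map at pyramid level P<level>."""
    return image_size // (2 ** level)


def cells_covered(object_px: float, level: int) -> float:
    """Number of feature-map cells spanned by an object of this pixel width."""
    return object_px / (2 ** level)


def summarize(levels):
    """Map each level name to (grid side length, cells spanned by the object)."""
    return {
        f"P{k}": (grid_size(IMAGE_SIZE, k), cells_covered(OBJECT_PX, k))
        for k in levels
    }


baseline_heads = summarize([3, 4, 5])   # stock YOLOv8 detection levels
small_target_heads = summarize([2, 3])  # head levels named in the abstract

for name, heads in [("baseline", baseline_heads), ("P2/P3", small_target_heads)]:
    for level, (grid, cells) in heads.items():
        print(f"{name:9s} {level}: {grid}x{grid} grid, object spans {cells:.2f} cells")
```

Under these assumptions, the target spans four cells at P2 but only half a cell at P5, which is one plausible reading of why dropping P5 and adding a P2 head can help small targets while also removing the parameters of the deepest layers.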


Source journal

Journal of King Saud University-Computer and Information Sciences (COMPUTER SCIENCE, INFORMATION SYSTEMS)

CiteScore

10.50

Self-citation rate

8.70%

Articles published

656

Review time

29 days

Journal description: In 2022 the Journal of King Saud University - Computer and Information Sciences will become an author paid open access journal. Authors who submit their manuscript after October 31st 2021 will be asked to pay an Article Processing Charge (APC) after acceptance of their paper to make their work immediately, permanently, and freely accessible to all. The Journal of King Saud University - Computer and Information Sciences is a refereed, international journal that covers all aspects of both foundations of computer and its practical applications.
