S-YOLO: An enhanced small object detection method based on adaptive gating strategy and dynamic multi-scale focus module.

IF 6.3 | Tier 1 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Neural Networks | Pub Date: 2025-11-01 | Epub Date: 2025-07-01 | DOI: 10.1016/j.neunet.2025.107782
Zengnan Wang, Feng Yan, Liejun Wang, Yabo Yin, Jiahuan Lin
{"title":"S-YOLO: An enhanced small object detection method based on adaptive gating strategy and dynamic multi-scale focus module.","authors":"Zengnan Wang, Feng Yan, Liejun Wang, Yabo Yin, Jiahuan Lin","doi":"10.1016/j.neunet.2025.107782","DOIUrl":null,"url":null,"abstract":"<p><p>Detecting small objects in drone aerial imagery presents significant challenges, particularly when algorithms must operate in real-time under computational constraints. To address this issue, we propose S-YOLO, an efficient and streamlined small object detection framework based on YOLOv10. The S-YOLO architecture emphasizes three key innovations: (1) Enhanced Small Object Detection Layers: These layers augment semantic richness to improve detection of diminutive targets. (2) C2fGCU Module: Incorporating Gated Convolutional Units (GCU), this module adaptively modulates activation strength through deep feature analysis, enabling the model to concentrate on salient information while effectively mitigating background interference. (3) Dynamic Multi-Scale Fusion (DMSF) Module: By integrating SE-Norm with multi-scale feature extraction, this component dynamically recalibrates feature weights to optimize cross-scale information integration and focus. S-YOLO surpasses YOLOv10-n, achieving mAP50:95 improvements of 5.3%, 4.4%, and 1.4% on the VisDrone2019, AI-TOD, and DOTA1.0 datasets, respectively. Notably, S-YOLO maintains fewer parameters than YOLOv10-n while processing 285 images per second, establishing it as a highly efficient solution for real-time small object detection in aerial imagery.</p>","PeriodicalId":49763,"journal":{"name":"Neural Networks","volume":"191 ","pages":"107782"},"PeriodicalIF":6.3000,"publicationDate":"2025-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neural Networks","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1016/j.neunet.2025.107782","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/7/1 0:00:00","PubModel":"Epub","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Detecting small objects in drone aerial imagery presents significant challenges, particularly when algorithms must operate in real time under computational constraints. To address this issue, we propose S-YOLO, an efficient and streamlined small object detection framework based on YOLOv10. The S-YOLO architecture emphasizes three key innovations: (1) Enhanced Small Object Detection Layers: These layers augment semantic richness to improve detection of diminutive targets. (2) C2fGCU Module: Incorporating Gated Convolutional Units (GCU), this module adaptively modulates activation strength through deep feature analysis, enabling the model to concentrate on salient information while effectively mitigating background interference. (3) Dynamic Multi-Scale Fusion (DMSF) Module: By integrating SE-Norm with multi-scale feature extraction, this component dynamically recalibrates feature weights to optimize cross-scale information integration and focus. S-YOLO surpasses YOLOv10-n, achieving mAP50:95 improvements of 5.3%, 4.4%, and 1.4% on the VisDrone2019, AI-TOD, and DOTA1.0 datasets, respectively. Notably, S-YOLO maintains fewer parameters than YOLOv10-n while processing 285 images per second, establishing it as a highly efficient solution for real-time small object detection in aerial imagery.
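The abstract describes the two custom modules only at a high level, so the snippet below is a minimal, illustrative reading of the two mechanisms it names: a gated convolutional unit that scales its own activations with a learned sigmoid gate (the idea behind the C2fGCU module), and a squeeze-and-excitation style channel reweighting of the kind the DMSF module's SE-Norm component implies. It assumes a PyTorch implementation; the class names, layer layout, and sizes here are assumptions for illustration, not the authors' code.

```python
# Illustrative sketch only: C2fGCU and DMSF are described at abstract level,
# so this layer layout is an assumption, not the paper's implementation.
import torch
import torch.nn as nn


class GatedConvUnit(nn.Module):
    """Gated convolution sketch: a sigmoid gate computed from the input
    scales the main convolution's response, damping background activations."""

    def __init__(self, channels: int):
        super().__init__()
        self.feature_conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.gate_conv = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        features = self.feature_conv(x)
        gate = torch.sigmoid(self.gate_conv(x))   # adaptive activation strength in [0, 1]
        return features * gate                    # salient regions kept, background suppressed


class SERecalibration(nn.Module):
    """Squeeze-and-excitation style channel reweighting, the kind of dynamic
    feature recalibration suggested by the DMSF module's SE-Norm component."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.pool = nn.AdaptiveAvgPool2d(1)        # global spatial average per channel
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        weights = self.fc(self.pool(x).view(b, c)).view(b, c, 1, 1)
        return x * weights                         # per-channel dynamic reweighting


if __name__ == "__main__":
    x = torch.randn(1, 64, 80, 80)                 # a typical neck-level feature map
    print(GatedConvUnit(64)(x).shape)              # torch.Size([1, 64, 80, 80])
    print(SERecalibration(64)(x).shape)            # torch.Size([1, 64, 80, 80])
```

Both sketches preserve the input shape, which is consistent with their role as drop-in blocks inside a YOLO-style backbone or neck; the paper's actual modules combine these ideas with multi-scale feature extraction and the added small-object detection layers.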

Source journal
Neural Networks (Engineering & Technology, Computer Science: Artificial Intelligence)
CiteScore: 13.90
Self-citation rate: 7.70%
Articles published: 425
Review time: 67 days
Journal description: Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.