HMMSC-YOLO: A Comprehensively Improved Small Target Detection Algorithm
Authors: Chongyang Fan, Wenfang Li, Chang Lin
Journal: Concurrency and Computation-Practice & Experience, vol. 37 (25-26)
Publication date: 2025-09-30
Publication type: Journal Article
DOI: 10.1002/cpe.70288
URL: https://onlinelibrary.wiley.com/doi/10.1002/cpe.70288
Impact factor: 1.5; JCR: Q3 (Computer Science, Software Engineering)
Citation count: 0
Abstract
This study addressed the challenges of small target detection in aerial imaging applications, including limited pixel coverage, weak feature representation, and complex background interference, by proposing a collaborative optimisation algorithm named HMMSC-YOLO. Firstly, a CNN-Transformer heterogeneous feature interaction network was constructed to mitigate high-frequency information attenuation during hierarchical transmission of small targets. Secondly, a parameter-shared dilated convolutional chain structure was designed, employing a weight-reuse strategy across multi-branch heterogeneous receptive fields to enhance geometric feature sensitivity towards minuscule targets. A differentiable affine transformation-guided multi-kernel dynamic fusion mechanism was further developed, achieving high-precision geometric alignment of cross-scale features through learnable deformation fields, thereby overcoming the rigid fusion limitations of conventional feature pyramids. A dual-attention-driven feature recalibration architecture was introduced to improve target localisation robustness under complex background interference. Finally, a dual-path collaborative downsampling module was implemented to suppress feature confusion caused by traditional single-path downsampling. Experimental evaluations on the VisDrone2019 dataset demonstrated 1.4% and 1% improvements in mAP50 and mAP50:95 metrics respectively compared to baseline models, alongside 23.3% and 2.5% reductions in parameter quantity and computational costs. The algorithm exhibited superior localisation accuracy and occlusion resistance in dense small target scenarios, establishing an innovative technical framework for practical applications including aerial image analysis and low-light environmental monitoring.
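The abstract gives no implementation details for the parameter-shared dilated convolution, so the following is only a minimal 1D sketch of the general idea of weight reuse across branches with different receptive fields. It is not the authors' module: the function names, the choice to sum the branches, the zero padding, and the dilation rates are all assumptions made for illustration.

```python
from typing import List, Sequence


def dilated_conv1d(signal: Sequence[float], kernel: Sequence[float],
                   dilation: int) -> List[float]:
    """Same-length 1D dilated cross-correlation with zero padding.

    Hypothetical helper; assumes an odd-length kernel centred on each sample.
    """
    half = (len(kernel) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, w in enumerate(kernel):
            # Taps are spaced `dilation` samples apart around position i.
            idx = i + (j - half) * dilation
            if 0 <= idx < len(signal):
                acc += w * signal[idx]
        out.append(acc)
    return out


def shared_dilated_branches(signal: Sequence[float], kernel: Sequence[float],
                            dilations: Sequence[int] = (1, 2, 3)) -> List[float]:
    """Apply ONE kernel at several dilation rates and sum the branch outputs.

    Summation is an assumed fusion rule; the paper may fuse differently.
    """
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [sum(values) for values in zip(*branches)]


def receptive_field(kernel_size: int, dilation: int) -> int:
    """Span (in samples) covered by a single dilated kernel."""
    return dilation * (kernel_size - 1) + 1


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    weights = [1, 2, 3]  # the only learnable parameters, reused by every branch
    print(shared_dilated_branches(impulse, weights, dilations=(1, 2)))
    print([receptive_field(len(weights), d) for d in (1, 2, 3)])
```

In this sketch every branch reuses the same three weights, so adding a dilation rate widens the receptive field without adding parameters. That is consistent in spirit with the reduced parameter count the abstract reports, though the abstract does not attribute the reduction to any single component.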
About the journal:
Concurrency and Computation: Practice and Experience (CCPE) publishes high-quality, original research papers, and authoritative research review papers, in the overlapping fields of:
Parallel and distributed computing;
High-performance computing;
Computational and data science;
Artificial intelligence and machine learning;
Big data applications, algorithms, and systems;
Network science;
Ontologies and semantics;
Security and privacy;
Cloud/edge/fog computing;
Green computing; and
Quantum computing.