Xiaoyao Yang , Wenyang Zhao , Pengchao Sun , Wenda Zhao , Wenlong Yang
{"title":"Multi-dimensional feature fusion network design and performance optimisation for small target detection","authors":"Xiaoyao Yang , Wenyang Zhao , Pengchao Sun , Wenda Zhao , Wenlong Yang","doi":"10.1016/j.engappai.2025.112425","DOIUrl":null,"url":null,"abstract":"<div><div>Due to the long distance of image acquisition, high imaging resolution, complex feature background, shooting angle, etc. The result is that there are few features available for small targets and they are easily interfered by background noise, which poses a challenge to the detection of small targets. To address the above problems, this paper proposes a target detection network (Convolution-based Small Target Detection Network, CSTDNet) with enhanced feature information, which integrates a multi-dimensional information fusion strategy for small target features. An all-round efficient feature fusion mudule (AeFusion) is introduced, which emphasises the fusion of multi-dimensional feature information, enhances the model's ability to focus on key information and suppress redundant information, and strengthens the ability to characterise local features and details, improving the effectiveness of the information and computational efficiency. In order to further enhance the location-awareness capability in cross-layer interaction, this paper introduces a novel decoupling head (Self-aware task decomposition for fine-grained feature sharing, STFS), which improves the accuracy of the small-target classification and localisation tasks through efficient detail sharing and task auto-alignment functions. And localisation tasks through efficient detail sharing and task auto-alignment. This study evaluates the effectiveness of the algorithm on five different scenarios containing small target datasets. Experimental results show that CSTDNet achieved improvements of 6.6 %, 5.8 %, 5.8 %, 5.5 %, and 5.6 % over the baseline model in terms of the mean average precision ([email protected]) metric on the Visdrone 2019, BDD100K, WiderPerson, SODA10M, and AppleDatas datasets, respectively, demonstrating stronger detection performance.</div></div>","PeriodicalId":50523,"journal":{"name":"Engineering Applications of Artificial Intelligence","volume":"162 ","pages":"Article 112425"},"PeriodicalIF":8.0000,"publicationDate":"2025-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Engineering Applications of Artificial Intelligence","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S095219762502456X","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
Due to the long distance of image acquisition, high imaging resolution, complex feature background, shooting angle, etc. The result is that there are few features available for small targets and they are easily interfered by background noise, which poses a challenge to the detection of small targets. To address the above problems, this paper proposes a target detection network (Convolution-based Small Target Detection Network, CSTDNet) with enhanced feature information, which integrates a multi-dimensional information fusion strategy for small target features. An all-round efficient feature fusion mudule (AeFusion) is introduced, which emphasises the fusion of multi-dimensional feature information, enhances the model's ability to focus on key information and suppress redundant information, and strengthens the ability to characterise local features and details, improving the effectiveness of the information and computational efficiency. In order to further enhance the location-awareness capability in cross-layer interaction, this paper introduces a novel decoupling head (Self-aware task decomposition for fine-grained feature sharing, STFS), which improves the accuracy of the small-target classification and localisation tasks through efficient detail sharing and task auto-alignment functions. And localisation tasks through efficient detail sharing and task auto-alignment. This study evaluates the effectiveness of the algorithm on five different scenarios containing small target datasets. Experimental results show that CSTDNet achieved improvements of 6.6 %, 5.8 %, 5.8 %, 5.5 %, and 5.6 % over the baseline model in terms of the mean average precision ([email protected]) metric on the Visdrone 2019, BDD100K, WiderPerson, SODA10M, and AppleDatas datasets, respectively, demonstrating stronger detection performance.
期刊介绍:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.