Xiaoyao Yang, Wenyang Zhao, Pengchao Sun, Wenda Zhao, Wenlong Yang
Engineering Applications of Artificial Intelligence, Volume 162, Article 112425. Published 2025-09-26.
DOI: 10.1016/j.engappai.2025.112425
URL: https://www.sciencedirect.com/science/article/pii/S095219762502456X
Impact factor: 8.0. JCR: Q1 (Automation & Control Systems).
Multi-dimensional feature fusion network design and performance optimisation for small target detection
Small targets are hard to detect because of long image-acquisition distances, high imaging resolution, complex backgrounds, and varying shooting angles: few features are available for such targets, and those features are easily disturbed by background noise. To address these problems, this paper proposes a target detection network with enhanced feature information, the Convolution-based Small Target Detection Network (CSTDNet), which integrates a multi-dimensional information fusion strategy for small-target features. An all-round efficient feature fusion module (AeFusion) is introduced, which emphasises the fusion of multi-dimensional feature information, enhances the model's ability to focus on key information and suppress redundant information, and strengthens its ability to characterise local features and details, improving both the effectiveness of the information and computational efficiency. To further enhance location awareness in cross-layer interaction, this paper introduces a novel decoupled head, Self-aware task decomposition for fine-grained feature sharing (STFS), which improves the accuracy of small-target classification and localisation through efficient detail sharing and automatic task alignment. The algorithm is evaluated on five datasets containing small targets, each drawn from a different scenario. Experimental results show that CSTDNet improves on the baseline model by 6.6%, 5.8%, 5.8%, 5.5%, and 5.6% in mean average precision (mAP@0.5) on the VisDrone 2019, BDD100K, WiderPerson, SODA10M, and AppleDatas datasets, respectively, demonstrating stronger detection performance.
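The reported gains are stated in mean average precision, so it may help to see how that metric is typically computed. The following is a minimal sketch in plain Python, not the authors' evaluation code: it assumes the common IoU threshold of 0.5, a single class, a single image, and all-point interpolation of the precision-recall curve. The names `iou` and `average_precision` are hypothetical helpers introduced for illustration only.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), axis-aligned


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    detections: List[Tuple[float, Box]],  # (confidence, box) pairs
    ground_truth: List[Box],
    iou_threshold: float = 0.5,  # assumption: the usual "@0.5" setting
) -> float:
    """All-point interpolated AP for one class on one image (simplified)."""
    if not ground_truth:
        return 0.0
    matched = [False] * len(ground_truth)
    hits = []  # 1 = true positive, 0 = false positive, by descending confidence
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        # A detection counts only if it overlaps an unclaimed ground-truth box.
        if best_iou >= iou_threshold and not matched[best_idx]:
            matched[best_idx] = True
            hits.append(1)
        else:
            hits.append(0)

    # Precision and recall after each ranked detection.
    precisions, recalls, tp = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        tp += hit
        precisions.append(tp / rank)
        recalls.append(tp / len(ground_truth))

    # Interpolate: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

mAP is then the mean of this AP over all classes, with detections pooled across the whole dataset rather than one image. The IoU criterion also illustrates why small targets are demanding: under this definition an 8×8 box shifted by 2 pixels in each direction has an IoU of about 0.39 with its ground truth and fails the 0.5 threshold, whereas an 80×80 box with the same shift scores about 0.91.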
About the journal:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.