Muhammad Shahroze Ali, Afshan Latif, Muhammad Waseem Anwar, Muhammad Hashir Ashraf
Engineering Applications of Artificial Intelligence, Volume 149, Article 110488. Journal article, published 2025-03-13.
DOI: 10.1016/j.engappai.2025.110488
URL: https://www.sciencedirect.com/science/article/pii/S0952197625004889
Impact factor: 8.0 (JCR Q1, Automation & Control Systems)
Citations: 0
Multiscale self-attention for unmanned ariel vehicle-based infrared thermal images detection
Object detection and recognition in unmanned aerial vehicle-based images is critical for various applications but is often challenged by complex backgrounds, diverse object scales, densely clustered small objects, and uneven object distributions. This paper introduces a novel deep learning-based artificial intelligence framework that integrates the Multiscale Self-Attention Guidance and Feature Fusion Network with the You Only Look Once model, tailored explicitly for artificial intelligence-driven unmanned aerial vehicle-based infrared thermal image analysis. The proposed methodology offers four key advancements in the You Only Look Once architecture to enhance object detection performance. First, the Multi-Head Self-Attention Transformer module combines global and local information, enabling precise object localization while mitigating the influence of complex backgrounds. Second, the Multiscale Parallel Sampling Feature Fusion module optimizes the fusion of multiscale features. Third, fine-grained shallow feature maps are integrated into the fusion process to detect densely packed small objects accurately. Lastly, the Inverse-Residual Feature Enhancement module, positioned before the detection head, enhances feature extraction for small objects. Experimental evaluations on the High Altitude Infrared Thermal Unmanned Aerial Vehicle dataset demonstrate significant improvements, achieving a Mean Average Precision of 95.1%, Recall of 92.0%, and F1-Score of 91.0%. The framework’s robustness is further validated on the Wildland-fire Infrared Thermal Unmanned Aerial System dataset, achieving a Mean Average Precision of 82.1%, Recall of 88.0%, and F1-Score of 82.0%. Comparative analyses with state-of-the-art methods confirm its superiority and offer a scalable artificial intelligence-driven solution for unmanned aerial vehicle applications, advancing object detection capabilities in critical scenarios.
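The abstract names a Multi-Head Self-Attention Transformer module but gives no implementation details. As a rough illustration of the general mechanism only, the following is a minimal NumPy sketch of standard multi-head self-attention applied to a flattened feature map. It is not the authors' module: the function name, the residual connection, the random projection weights, and the 8x8x16 feature-map size are all assumptions made for this example.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def multi_head_self_attention(feature_map, w_q, w_k, w_v, w_o, num_heads):
    """Generic multi-head self-attention over an (H, W, C) feature map.

    Hypothetical helper for illustration; not taken from the paper.
    Returns the attended feature map (H, W, C) and the attention
    weights (num_heads, H*W, H*W).
    """
    h, w, c = feature_map.shape
    if c % num_heads != 0:
        raise ValueError("channel count must be divisible by num_heads")
    d = c // num_heads          # per-head channel width
    n = h * w                   # every spatial position becomes one token

    tokens = feature_map.reshape(n, c)

    def split_heads(x):
        # (N, C) -> (heads, N, d)
        return x.reshape(n, num_heads, d).transpose(1, 0, 2)

    q = split_heads(tokens @ w_q)
    k = split_heads(tokens @ w_k)
    v = split_heads(tokens @ w_v)

    # Scaled dot-product attention: each position attends to all others,
    # which is how global context enters the local features.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)   # (heads, N, N)
    attn = softmax(scores, axis=-1)
    context = attn @ v                               # (heads, N, d)

    # Merge heads back to (N, C), project, and add a residual connection
    # (the residual is an assumption, common in Transformer blocks).
    merged = context.transpose(1, 0, 2).reshape(n, c)
    out = tokens + merged @ w_o
    return out.reshape(h, w, c), attn


# Example usage with random weights (assumed sizes, for illustration only).
rng = np.random.default_rng(0)
channels, heads = 16, 4
fmap = rng.standard_normal((8, 8, channels))
weights = [rng.standard_normal((channels, channels)) * 0.1 for _ in range(4)]
attended, attention = multi_head_self_attention(fmap, *weights, num_heads=heads)
print(attended.shape, attention.shape)
```

Because the attention matrix is quadratic in the number of positions, sketches like this are usually applied to the coarser, low-resolution feature maps of a detector; where the paper places its module is not stated in the abstract.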
About the journal:
Artificial Intelligence (AI) is pivotal in driving the fourth industrial revolution, witnessing remarkable advancements across various machine learning methodologies. AI techniques have become indispensable tools for practicing engineers, enabling them to tackle previously insurmountable challenges. Engineering Applications of Artificial Intelligence serves as a global platform for the swift dissemination of research elucidating the practical application of AI methods across all engineering disciplines. Submitted papers are expected to present novel aspects of AI utilized in real-world engineering applications, validated using publicly available datasets to ensure the replicability of research outcomes. Join us in exploring the transformative potential of AI in engineering.