IPD-YOLO: Person detection in infrared images from UAV perspective based on improved YOLO11

IF 2.9 · Zone 3 (Engineering Technology) · JCR Q2: ENGINEERING, ELECTRICAL & ELECTRONIC

Digital Signal Processing Pub Date : 2025-07-11 DOI:10.1016/j.dsp.2025.105469

Mengyang Li, Nan Yan

Digital Signal Processing, Volume 168, Article 105469. Journal Article. https://www.sciencedirect.com/science/article/pii/S1051200425004919

Citations: 0

Abstract


IPD-YOLO: Person detection in infrared images from UAV perspective based on improved YOLO11


The integration of UAV technology and deep learning object detection algorithms for human target detection has emerged as a prominent area in current research and application. However, practical implementation faces significant challenges under low-light conditions at night. To address this issue, this paper presents a solution based on an infrared image sensor mounted on a UAV. The proposed method employs IPD-YOLO, an improved deep learning object detection algorithm derived from YOLO11, to detect humans in drone-captured infrared images.

First, the detection layer is reconfigured to better accommodate small target detection from aerial perspectives. Second, the MASRCNet feature extraction module is introduced to enhance the model's capability in extracting and fusing high- and low-dimensional features along with contextual information through a star-shaped operation structure and residual context anchors. Third, the LQEHead detection head is designed, incorporating a localization quality estimator to assess the quality of detection boxes and refine the classification branch. Finally, a novel NWD-Inner CIoU loss function is proposed, combining normalized Wasserstein distance with an inner auxiliary frame mechanism to improve the localization accuracy of small targets.

Ablation experiments demonstrate that each improvement contributes effectively to overall performance: adjusting the detection layer increases mAP@50 by 4.6 percentage points and mAP@50:95 by 2.9 percentage points. Incorporating MASRCNet further improves mAP@50 by 0.6 percentage points and mAP@50:95 by 0.1 percentage points. With LQEHead, mAP@75 reaches 0.495 and mAP@50:95 increases to 0.496. The adoption of the NWD-Inner CIoU loss function boosts mAP@50 to 0.915, mAP@75 to 0.500, and mAP@50:95 to 0.501.

Compared with mainstream YOLO variants such as YOLOv5n, YOLOv8n, YOLOv10n, and YOLO11n, IPD-YOLO achieves improvements of 4.7, 7.4, 6.3, and 6.4 percentage points respectively on mAP@50, and enhancements of 6.7, 5.3, 4.9, and 4.4 percentage points on mAP@50:95. Furthermore, IPD-YOLO outperforms advanced models including G-YOLO, LMANet, YOFIR, and YOLO-TSL, achieving average improvements of 3.5, 2.3, 2.8, and 3.7 percentage points on mAP@50, and 5.3, 2.1, 4.4, and 5.0 percentage points on mAP@50:95 respectively. Compared with RT-DETR, IPD-YOLO maintains high detection accuracy while significantly reducing model parameters and computational cost, thereby enhancing its feasibility for real-world deployment. These results comprehensively validate the superior performance of IPD-YOLO in human detection tasks using UAV-based infrared imagery.
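The abstract names the ingredients of the NWD-Inner CIoU loss but not the exact formula, so the following is only a minimal sketch of how such a loss might be assembled from the commonly published definitions of its parts. It assumes boxes in `(cx, cy, w, h)` form, models each box as a 2D Gaussian for the normalized Wasserstein distance, and scales both boxes about their centers to form the inner auxiliary boxes. The constant `c`, the scaling `ratio`, the blend weight `lam`, and the weighted-sum combination in `nwd_inner_ciou_loss` are all illustrative assumptions, not values or formulas taken from the paper.

```python
import math

def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between two (cx, cy, w, h) boxes.

    Each box is treated as a 2D Gaussian with mean (cx, cy) and covariance
    diag(w^2/4, h^2/4). `c` is a dataset-dependent constant; 12.8 is only a
    placeholder (assumption).
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    w2_sq = (ax - bx) ** 2 + (ay - by) ** 2 \
        + ((aw - bw) / 2) ** 2 + ((ah - bh) / 2) ** 2
    return math.exp(-math.sqrt(w2_sq) / c)

def iou(box_a, box_b, ratio=1.0):
    """IoU after scaling both boxes by `ratio` about their own centers.

    ratio=1.0 gives the ordinary IoU; ratio<1.0 gives the IoU of the
    smaller inner auxiliary boxes.
    """
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    aw, ah, bw, bh = aw * ratio, ah * ratio, bw * ratio, bh * ratio
    iw = min(ax + aw / 2, bx + bw / 2) - max(ax - aw / 2, bx - bw / 2)
    ih = min(ay + ah / 2, by + bh / 2) - max(ay - ah / 2, by - bh / 2)
    inter = max(iw, 0.0) * max(ih, 0.0)
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def ciou_loss(pred, gt):
    """Standard CIoU loss: 1 - IoU + center-distance term + aspect-ratio term."""
    px, py, pw, ph = pred
    gx, gy, gw, gh = gt
    i = iou(pred, gt)
    # Width and height of the smallest box enclosing both boxes.
    cw = max(px + pw / 2, gx + gw / 2) - min(px - pw / 2, gx - gw / 2)
    ch = max(py + ph / 2, gy + gh / 2) - min(py - ph / 2, gy - gh / 2)
    rho2 = (px - gx) ** 2 + (py - gy) ** 2
    v = (4 / math.pi ** 2) * (math.atan(gw / gh) - math.atan(pw / ph)) ** 2
    denom = 1 - i + v
    alpha = v / denom if denom > 0 else 0.0
    return 1 - i + rho2 / (cw ** 2 + ch ** 2) + alpha * v

def nwd_inner_ciou_loss(pred, gt, ratio=0.75, lam=0.5):
    """Hypothetical blend: lam * (1 - NWD) + (1 - lam) * Inner-CIoU."""
    # Inner-CIoU swaps the plain IoU inside CIoU for the inner-box IoU.
    inner_ciou = ciou_loss(pred, gt) + iou(pred, gt) - iou(pred, gt, ratio)
    return lam * (1 - nwd(pred, gt)) + (1 - lam) * inner_ciou
```

The motivation for the NWD term shows up with tiny, non-overlapping boxes: `iou((10, 10, 4, 4), (16, 10, 4, 4))` is zero no matter how far apart the boxes are, while `nwd` on the same pair stays strictly between 0 and 1 and shrinks as the boxes move apart, so it still provides a usable training signal for small targets.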


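Source journal

Digital Signal Processing (Engineering Technology: Engineering, Electrical & Electronic)

CiteScore: 5.30

Self-citation rate: 17.20%

Articles published: 435

Review time: 66 days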

About the journal: Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The Journal invites top quality research articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal. The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:

• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing including seismic signal processing
• chemioinformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy,