IPD-YOLO: Person detection in infrared images from UAV perspective based on improved YOLO11
Authors: Mengyang Li, Nan Yan
Journal: Digital Signal Processing, volume 168, Article 105469 (published 2025-07-11)
DOI: 10.1016/j.dsp.2025.105469
URL: https://www.sciencedirect.com/science/article/pii/S1051200425004919
Journal metrics: impact factor 2.9; JCR Q2 (Engineering, Electrical & Electronic)
The integration of UAV technology and deep learning object detection algorithms for human target detection has emerged as a prominent area in current research and application. However, practical implementation faces significant challenges under low-light conditions at night. To address this issue, this paper presents a solution based on an infrared image sensor mounted on a UAV. The proposed method employs IPD-YOLO, an improved deep learning object detection algorithm derived from YOLO11, to detect humans in drone-captured infrared images. First, the detection layer is reconfigured to better accommodate small target detection from aerial perspectives. Second, the MASRCNet feature extraction module is introduced to enhance the model's capability in extracting and fusing high- and low-dimensional features along with contextual information through a star-shaped operation structure and residual context anchors. Third, the LQEHead detection head is designed, incorporating a localization quality estimator to assess the quality of detection boxes and refine the classification branch. Finally, a novel NWD-Inner CIoU loss function is proposed, combining normalized Wasserstein distance with an inner auxiliary frame mechanism to improve the localization accuracy of small targets. Ablation experiments demonstrate that each improvement contributes effectively to overall performance: adjusting the detection layer increases mAP@50 by 4.6 percentage points and mAP@50:95 by 2.9 percentage points. Incorporating MASRCNet further improves mAP@50 by 0.6 percentage points and mAP@50:95 by 0.1 percentage points. With LQEHead, mAP@75 reaches 0.495 and mAP@50:95 increases to 0.496. The adoption of the NWD-Inner CIoU loss function boosts mAP@50 to 0.915, mAP@75 to 0.500, and mAP@50:95 to 0.501. 
Compared with mainstream YOLO variants such as YOLOv5n, YOLOv8n, YOLOv10n, and YOLO11n, IPD-YOLO achieves improvements of 4.7, 7.4, 6.3, and 6.4 percentage points respectively on mAP@50, and enhancements of 6.7, 5.3, 4.9, and 4.4 percentage points on mAP@50:95. Furthermore, IPD-YOLO outperforms advanced models including G-YOLO, LMANet, YOFIR, and YOLO-TSL, achieving average improvements of 3.5, 2.3, 2.8, and 3.7 percentage points on mAP@50, and 5.3, 2.1, 4.4, and 5.0 percentage points on mAP@50:95 respectively. Compared with RT-DETR, IPD-YOLO maintains high detection accuracy while significantly reducing model parameters and computational cost, thereby enhancing its feasibility for real-world deployment. These results comprehensively validate the superior performance of IPD-YOLO in human detection tasks using UAV-based infrared imagery.
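The abstract names the two ingredients of the NWD-Inner CIoU loss but does not give its formula. The sketch below illustrates only those two ingredients as they are commonly defined elsewhere: the normalized Wasserstein distance between boxes modelled as 2-D Gaussians, and an IoU computed on auxiliary "inner" boxes scaled about each box centre. The function names, the scale constant `c`, and the default `ratio` are assumptions for illustration. The CIoU penalty terms and the way the paper weights the pieces are not reproduced here.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between two (cx, cy, w, h) boxes.

    Each box is treated as a 2-D Gaussian with mean (cx, cy) and
    covariance diag(w^2 / 4, h^2 / 4). `c` is a dataset-dependent scale;
    12.8 is a placeholder assumption, not a value taken from the paper.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two Gaussians.
    w2_sq = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2) ** 2
        + ((ha - hb) / 2) ** 2
    )
    return math.exp(-math.sqrt(w2_sq) / c)


def inner_iou(box_a, box_b, ratio=0.75):
    """IoU of auxiliary boxes obtained by scaling each box about its centre.

    `ratio` < 1 shrinks the boxes, `ratio` > 1 enlarges them. The default
    is a hypothetical choice for this sketch.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    wa, ha, wb, hb = wa * ratio, ha * ratio, wb * ratio, hb * ratio
    # Overlap extents of the scaled boxes along x and y.
    iw = min(cxa + wa / 2, cxb + wb / 2) - max(cxa - wa / 2, cxb - wb / 2)
    ih = min(cya + ha / 2, cyb + hb / 2) - max(cya - ha / 2, cyb - hb / 2)
    inter = max(iw, 0.0) * max(ih, 0.0)
    union = wa * ha + wb * hb - inter
    return inter / union if union > 0 else 0.0
```

For two 4×4 boxes whose centres are 3 pixels apart, the shrunken inner boxes (`ratio=0.75`) no longer overlap, so the inner IoU drops to zero, while the NWD still varies smoothly with the centre offset. This smoothness under small displacements is the usual motivation for using a Wasserstein-based term with small targets.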
About the journal:
Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The Journal invites top-quality research articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal.
The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing including seismic signal processing
• chemioinformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy