Lei Cao, Qing Wang, Yunhui Luo, Yongjie Hou, Wanglin Zheng, Haiming Qu
Optics Communications, Volume 582, Article 131612. Published 2025-02-19. Impact factor: 2.5 (JCR Q2, Optics).
DOI: 10.1016/j.optcom.2025.131612
https://www.sciencedirect.com/science/article/pii/S0030401825001403
A yolov8-based lightweight detection model for different perspectives infrared images
In recent years, the development of edge devices has enabled the capture and detection of infrared images from different perspectives. However, infrared image detection from different perspectives often suffers from low accuracy, large parameter counts, and a high computational burden. To address these challenges, we propose an enhanced lightweight infrared target detection method based on YOLOv8n. Our method adopts the ADown downsampling operation in the model’s backbone, which gradually reduces the spatial size of the feature map while increasing its depth. This operation enables the network to capture and process information at various scales more effectively, enhancing its feature representation capabilities. Additionally, our method incorporates the Triplet Attention mechanism to improve feature extraction. Finally, to optimize the feature pyramid and path aggregation network, we propose the ASF-DynamicScalSeq structure, which uses GSConv and VoVGSCSP to balance accuracy and processing speed. The ASF-ScalSeq module optimizes spatial-scale features, while the DySample dynamic upsampling function enhances performance without compromising efficiency. On infrared datasets, including those from UAV and vehicle perspectives, our algorithm shows a 2.9% and 1% increase in mAP50, respectively, compared to YOLOv8n. Additionally, our approach reduces parameters by 21.9%, decreases computational burden by 13.4%, and improves inference speed by 3.9% and 7.8%, respectively. To validate the effectiveness and robustness of our method, we conduct experiments on the FLIR and aerial infrared datasets. The results show a 0.9% and 1% increase in mAP50, a 21.9% reduction in parameters, a 13.4% decrease in computational burden, and a 5.9% and 7.8% improvement in inference speed, respectively, compared to YOLOv8n.
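The abstract says that ADown shrinks the spatial size of the feature map while increasing its depth. The sketch below traces only the tensor shape through one such block, in plain Python. It assumes the block follows the ADown layout from the YOLOv9 reference code (a 2x2 stride-1 average pool, then two parallel stride-2 branches whose outputs are concatenated); the abstract does not give these details, and the function names and channel counts here are hypothetical.

```python
def window_out(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output length along one axis for a conv or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


def adown_shape(channels_out: int, height: int, width: int) -> tuple[int, int, int]:
    """Shape (C, H, W) after one ADown-style block.

    Assumed layout (YOLOv9-style, not confirmed by the abstract):
      1. 2x2 average pool, stride 1, no padding.
      2. Channels split in half; one half goes through a 3x3 stride-2 conv,
         the other through a 3x3 stride-2 max pool followed by a 1x1 conv.
      3. The two branches are concatenated along the channel axis.
    """
    # Step 1: the stride-1 average pool trims one pixel per axis.
    h = window_out(height, kernel=2, stride=1, padding=0)
    w = window_out(width, kernel=2, stride=1, padding=0)
    # Step 2: both branches use kernel 3, stride 2, padding 1,
    # so they produce the same spatial size.
    h = window_out(h, kernel=3, stride=2, padding=1)
    w = window_out(w, kernel=3, stride=2, padding=1)
    # Step 3: each branch emits half of the output channels.
    return (2 * (channels_out // 2), h, w)


# Hypothetical stage: 64 input channels at 80x80, 128 output channels.
# The input channel count does not affect the output shape.
print(adown_shape(128, 80, 80))
```

Under these assumptions each block halves the height and width, so stacking several of them yields progressively coarser, deeper feature maps for the neck to fuse.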
About the journal:
Optics Communications invites original and timely contributions containing new results in various fields of optics and photonics. The journal considers theoretical and experimental research in areas ranging from the fundamental properties of light to technological applications. Topics covered include classical and quantum optics, optical physics and light-matter interactions, lasers, imaging, guided-wave optics and optical information processing. Manuscripts should offer clear evidence of novelty and significance. Papers concentrating on mathematical and computational issues, with limited connection to optics, are not suitable for publication in the Journal. Similarly, small technical advances, or papers concerned only with engineering applications or issues of materials science fall outside the journal scope.