A yolov8-based lightweight detection model for different perspectives infrared images

IF 2.5 | Zone 3 (Physics and Astrophysics) | JCR Q2 (Optics)

Optics Communications, Pub Date: 2025-02-19, DOI: 10.1016/j.optcom.2025.131612

Lei Cao, Qing Wang, Yunhui Luo, Yongjie Hou, Wanglin Zheng, Haiming Qu

Optics Communications, Volume 582, Article 131612. Journal article. https://www.sciencedirect.com/science/article/pii/S0030401825001403

Citations: 0

Abstract


A yolov8-based lightweight detection model for different perspectives infrared images

In recent years, the development of edge devices has enabled the capture and detection of infrared images from different perspectives. However, infrared image detection from different perspectives often suffers from low accuracy, large parameter counts, and a high computational burden. To address these challenges, we propose an enhanced lightweight infrared target detection method based on YOLOv8n. Our method adopts the ADown downsampling operation in the model’s backbone, which gradually reduces the spatial size of the feature map while increasing its depth. This operation enables the network to capture and process information at various scales more effectively, enhancing its feature representation capabilities. Additionally, our method incorporates the Triplet Attention mechanism to improve feature extraction. Finally, to optimize the feature pyramid and path aggregation network, we propose the ASF-DynamicScalSeq structure, which uses GSConv and VoVGSCSP to balance accuracy and processing speed. ASF-ScalSeq optimizes spatial-scale features, while the DySample dynamic upsampling function enhances performance without compromising efficiency. On infrared datasets, including those from UAV and vehicle perspectives, our algorithm shows a 2.9% and 1% increase in mAP50, respectively, compared to YOLOv8n. Additionally, our approach reduces parameters by 21.9%, decreases computational burden by 13.4%, and improves inference speed by 3.9% and 7.8%, respectively. To validate the effectiveness and robustness of our method, we conduct experiments on FLIR and aerial infrared datasets. The results show a 0.9% and 1% increase in mAP50, a 21.9% reduction in parameters, a 13.4% decrease in computational burden, and a 5.9% and 7.8% improvement in inference speed, respectively, compared to YOLOv8n.
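The abstract says ADown "gradually reduces the spatial size of the feature map while increasing its depth". The sketch below traces tensor shapes through an ADown-style block using only the standard convolution/pooling output-size formula. It assumes the YOLOv9 formulation of ADown (2x2 average pooling with stride 1, a channel split, a 3x3 stride-2 convolution on one half, and 3x3 stride-2 max pooling followed by a 1x1 convolution on the other half). The abstract does not give the paper's exact configuration, so the function names, the starting shape, and the channel widths here are illustrative assumptions, not the authors' settings.

```python
from typing import List, Tuple

Shape = Tuple[int, int, int]  # (channels, height, width)


def out_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Output length of a conv/pool layer along one spatial axis (floor mode)."""
    return (size + 2 * padding - kernel) // stride + 1


def adown_shape(shape: Shape, channels_out: int) -> Shape:
    """Trace the output shape of one ADown-style block (assumed YOLOv9 layout)."""
    _, height, width = shape
    # Step 1: 2x2 average pooling, stride 1, no padding (trims one pixel per axis).
    h = out_size(height, 2, 1, 0)
    w = out_size(width, 2, 1, 0)
    # Step 2: the channels are split in half; each branch emits channels_out // 2.
    half_out = channels_out // 2
    # Branch A: 3x3 convolution, stride 2, padding 1.
    branch_a = (half_out, out_size(h, 3, 2, 1), out_size(w, 3, 2, 1))
    # Branch B: 3x3 max pooling (stride 2, padding 1), then a 1x1 convolution.
    hb = out_size(out_size(h, 3, 2, 1), 1, 1, 0)
    wb = out_size(out_size(w, 3, 2, 1), 1, 1, 0)
    branch_b = (half_out, hb, wb)
    # Both branches must agree spatially before channel-wise concatenation.
    assert branch_a[1:] == branch_b[1:]
    # Step 3: concatenate the two branches along the channel axis.
    return (branch_a[0] + branch_b[0], branch_a[1], branch_a[2])


def trace_stages(start: Shape, widths: List[int]) -> List[Shape]:
    """Apply one ADown-style block per entry in `widths` and collect the shapes."""
    shapes = []
    current = start
    for channels_out in widths:
        current = adown_shape(current, channels_out)
        shapes.append(current)
    return shapes


if __name__ == "__main__":
    # Hypothetical starting feature map and channel widths, for illustration only.
    for stage in trace_stages((32, 160, 160), [64, 128, 256]):
        print(stage)
```

Under these assumed widths, each block halves the height and width while the channel count doubles, which is the size-for-depth trade the abstract describes.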


Source journal

Optics Communications (Physics: Optics)

CiteScore

5.10

Self-citation rate

8.30%

Articles published

681

Review time

38 days

About the journal: Optics Communications invites original and timely contributions containing new results in various fields of optics and photonics. The journal considers theoretical and experimental research in areas ranging from the fundamental properties of light to technological applications. Topics covered include classical and quantum optics, optical physics and light-matter interactions, lasers, imaging, guided-wave optics and optical information processing. Manuscripts should offer clear evidence of novelty and significance. Papers concentrating on mathematical and computational issues, with limited connection to optics, are not suitable for publication in the journal. Similarly, small technical advances, and papers concerned only with engineering applications or issues of materials science, fall outside the journal's scope.