THDet: A Lightweight and Efficient Traffic Helmet Object Detector based on YOLOv8
Yi Li, Huiying Xu, Xinzhong Zhu, Xiao Huang, Hongbo Li
Digital Signal Processing, Volume 155, Article 104765. Published 2024-09-07.
Impact factor: 2.9 (JCR Q2, Engineering, Electrical & Electronic)
DOI: 10.1016/j.dsp.2024.104765
https://www.sciencedirect.com/science/article/pii/S1051200424003907
Citations: 0
Abstract
Traffic helmet object detection is playing an increasingly important role in smart traffic applications. However, object size variation and small helmet detection remain challenging because such objects have poor visual appearance in the image. In this work, we present THDet, an efficient traffic helmet detector based on YOLOv8n that combines feature enhancement with a lightweight design. Specifically, we first embed coordinate attention into the C2f blocks, combined with the softmax activation function, to achieve feature channel aggregation and strong non-linear expression in the backbone for more effective feature extraction. Next, the Focal_CIoU loss function, which incorporates the Focal Loss method, is used to measure bounding box regression for various objects more precisely and to balance positive and negative examples during training. Then, a new lightweight detection head is designed with only two heads at suitable positions (P3 and P4) to perform the final classification and localization; this scheme saves 33.7% of the parameters relative to the baseline method. Finally, an Attention Refined Features Module (ARFM) is built to calibrate the multi-scale fused features by introducing 3-D weights generated from SimAttention, boosting the final detection accuracy. Extensive experiments demonstrate that the proposed method achieves notable performance in terms of detection accuracy and inference speed compared with the baseline YOLOv8n and many end-to-end detectors of similar model size. Concretely, THDet achieves 0.447 on the overall evaluation metric $mAP_{0.5-0.95}$, a 3.2% improvement in detection accuracy over YOLOv8n. In addition, THDet has only 2.2M parameters and runs at 295 FPS, a 33.4% reduction in parameters compared with YOLOv8n. The experimental results validate the effectiveness of the proposed method, showing that THDet outperforms mainstream real-time detection algorithms in accuracy, inference speed, and lightweight model design for traffic helmet object detection.
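The abstract does not give the formula for the Focal_CIoU loss. As a rough illustration only, the sketch below assumes one common formulation in which the standard CIoU loss is re-weighted by IoU raised to a power gamma, in the style of Focal-EIoU. The function names, the (x1, y1, x2, y2) box format, and the default gamma are illustrative assumptions, not details taken from the paper.

```python
import math


def ciou(box_a, box_b, eps=1e-9):
    """Return (IoU, CIoU) for two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    aw, ah = ax2 - ax1, ay2 - ay1
    bw, bh = bx2 - bx1, by2 - by1

    # Plain intersection over union.
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    iou = inter / (union + eps)

    # Squared centre distance, normalised by the squared diagonal
    # of the smallest box enclosing both inputs.
    cw = max(ax2, bx2) - min(ax1, bx1)
    ch = max(ay2, by2) - min(ay1, by1)
    c2 = cw ** 2 + ch ** 2 + eps
    rho2 = ((ax1 + ax2 - bx1 - bx2) ** 2 + (ay1 + ay2 - by1 - by2) ** 2) / 4.0

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4.0 / math.pi ** 2) * (
        math.atan(bw / (bh + eps)) - math.atan(aw / (ah + eps))
    ) ** 2
    alpha = v / (1.0 - iou + v + eps)
    return iou, iou - rho2 / c2 - alpha * v


def focal_ciou_loss(pred, target, gamma=0.5):
    """Assumed focal form: IoU**gamma * (1 - CIoU); gamma is hypothetical."""
    iou, ciou_value = ciou(pred, target)
    return (iou ** gamma) * (1.0 - ciou_value)
```

Under this assumed form, setting gamma to 0 recovers the plain CIoU loss, while larger values down-weight predictions that overlap the target poorly, so well-localized boxes dominate the regression gradient.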
About the journal:
Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The Journal invites top-quality research articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal.
The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing including seismic signal processing
• chemioinformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy