VBM-YOLO: an enhanced YOLO model with reduced information loss for vehicle body markers detection.

IF 2.5, Zone 4 (Computer Science), JCR Q2 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

PeerJ Computer Science Pub Date : 2025-06-02 eCollection Date: 2025-01-01 DOI:10.7717/peerj-cs.2932

Bin Wang, Chao Li, Chao Zhou, Jun Sun

PeerJ Computer Science, volume 11, e2932. Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12193416/pdf/

Citations: 0

Abstract

In vehicle safety detection, accurately identifying body markers on medium and large vehicles is critical to safe road travel. To address the loss of feature and gradient information in previous You Only Look Once (YOLO) series models, we design a novel Vehicle Body Markers YOLO (VBM-YOLO) model. First, the model integrates the cross-spatial-channel attention (CSCA) mechanism proposed in this study. CSCA uses cross-dimensional information to address interaction issues during the fusion of spatial and channel dimensions, significantly enhancing the model's representational capacity. Second, we propose a multi-scale selective feature pyramid network (MSSFPN). Through progressive fusion and multi-scale feature selection learning, MSSFPN alleviates the feature loss and target-layer information confusion caused by traditional top-down and bottom-up feature pyramids. Finally, we propose an auxiliary gradient branch (AGB). During training, AGB incorporates feature information from different target layers to help the current layer retain complete gradient information. Because AGB does not participate in model inference, it avoids additional inference overhead. Experimental results show that, compared with YOLOv8s on the vehicle body markers dataset, VBM-YOLO improves mean average precision (mAP) by 2.3% at an intersection over union (IoU) threshold of 0.5 and by 4.3% over the 0.5:0.95 threshold range. VBM-YOLO also achieves a better balance between accuracy and computational resources than other mainstream models and generalizes well to public datasets such as PASCAL VOC and D-Fire.
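
The two IoU settings quoted for the mAP gains can be illustrated with a small sketch. The abstract does not describe the evaluation protocol, so this assumes the common COCO-style convention in which "0.5:0.95" means ten thresholds from 0.5 to 0.95 in steps of 0.05. The names `iou`, `IOU_THRESHOLDS`, and `matched_thresholds` are hypothetical and are not taken from the paper.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes (x1, y1, x2, y2)."""
    # Corners of the overlap rectangle (may be empty).
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Assumption: COCO-style threshold set for "0.5:0.95" (step 0.05).
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, gt_box):
    """Thresholds at which this prediction counts as a true positive."""
    overlap = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if overlap >= t]


if __name__ == "__main__":
    gt = (0, 0, 10, 10)
    pred = (2, 0, 12, 10)  # shifted 2 units to the right
    print(iou(pred, gt))
    print(matched_thresholds(pred, gt))
```

Under this convention, a prediction that passes at IoU 0.5 can still fail at the stricter thresholds, which is why mAP over 0.5:0.95 rewards tighter localization than mAP at 0.5 alone.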


Source journal

PeerJ Computer Science (Computer Science: General Computer Science)

CiteScore: 6.10

Self-citation rate: 5.30%

Articles published: 332

Review time: 10 weeks

Journal description: PeerJ Computer Science is the new open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.