Branch-YOLO: An efficient object detector for thin structure objects like pantograph

Impact Factor 2.9 · Zone 3, Engineering & Technology · JCR Q2, ENGINEERING, ELECTRICAL & ELECTRONIC
Yaqian Li, Jiaqi Han, Haibin Li, Wenming Zhang
Journal: Digital Signal Processing, Volume 162, Article 105121
Published: 2025-03-09
Publication type: Journal Article
DOI: 10.1016/j.dsp.2025.105121
URL: https://www.sciencedirect.com/science/article/pii/S1051200425001435
Citations: 0

Abstract

The pantograph is a critical component of the railway Pantograph-OCS system, so its lifting and lowering states must be detected accurately to ensure efficient and safe operation. However, two principal problems hamper progress in accurate real-time detection: the pantograph's slender structure occupies only a few pixels, which makes effective feature extraction difficult, and detection accuracy is susceptible to interference from complex outdoor scenes. In this work, we propose a generalized detection model for branch-like objects with thin structures and multi-scale sizes, using train pantograph state detection as a case study. To this end, we propose Branch-YOLO, which is based on YOLOv8 and improves it in two aspects: feature extraction and feature fusion. First, we introduce the BrLayer, which consists of BranchConv and EFAC (Extend Receptive Field Attention Convolution). BranchConv adaptively captures features of thin and tortuous local structures, while EFAC expands the effective receptive field. Next, we propose a feature fusion network (BranchNet) that integrates multi-level semantic information and multi-scale features, significantly reducing background interference when detecting slender objects and improving the perception of objects at varying scales. In addition, we propose the LckLayer (Lightweight Cross-Kernel Convolution layer) and introduce the FdM (Feature Decomposition module) in BranchNet for a lightweight design that reduces computational overhead and improves model efficiency. The proposed Branch-YOLO not only achieves the best performance on the multi-scale pantograph datasets but also attains 42.9% AP on the COCO val2017 dataset, with 5.9M parameters and 16.8 GFLOPs.
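The abstract reports a parameter budget but does not describe the internals of the LckLayer or the FdM. As a rough illustration only of why kernel factorization tends to make a layer lighter, the sketch below counts the weights of a standard k×k convolution and of a 1×k convolution followed by a k×1 convolution. The factorized form is a generic, well-known technique chosen here as an assumption; it is not claimed to be the authors' design, and the function names and the channel width of 64 are hypothetical.

```python
def conv_params(c_in, c_out, k_h, k_w, bias=False):
    """Learnable weights in a 2D convolution with a k_h x k_w kernel."""
    return c_in * c_out * k_h * k_w + (c_out if bias else 0)


def standard_block(channels, k):
    """One k x k convolution that keeps the channel count unchanged."""
    return conv_params(channels, channels, k, k)


def factorized_block(channels, k):
    """A 1 x k convolution followed by a k x 1 convolution.

    Illustrative stand-in for a lightweight layer; NOT the paper's LckLayer.
    """
    return conv_params(channels, channels, 1, k) + conv_params(channels, channels, k, 1)


# Hypothetical layer width and kernel size, chosen only for the example.
channels, k = 64, 3
full = standard_block(channels, k)
light = factorized_block(channels, k)
print(f"standard {k}x{k}: {full} weights")
print(f"factorized 1x{k} + {k}x1: {light} weights")
print(f"ratio: {light / full:.3f}")
```

For a kernel size k, the factorized pair uses 2k weights per input/output channel pair instead of k², so the saving grows with the kernel size. Whether Branch-YOLO obtains its 5.9M parameters this way is not stated in the abstract.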
Source journal
Digital Signal Processing
Category: Engineering & Technology, Engineering: Electrical & Electronic
CiteScore: 5.30
Self-citation rate: 17.20%
Articles published: 435
Review time: 66 days
Journal description: Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing, yet it aims to be the most innovative. The Journal invites top-quality research articles at the frontiers of research in all aspects of signal processing. Its objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal. The journal places special emphasis on statistical signal processing methodology, such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing, including seismic signal processing
• chemoinformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy