Title: Branch-YOLO: An efficient object detector for thin structure objects like pantograph
Authors: Yaqian Li, Jiaqi Han, Haibin Li, Wenming Zhang
Journal: Digital Signal Processing, volume 162, Article 105121
Publication date: 2025-03-09
DOI: 10.1016/j.dsp.2025.105121
URL: https://www.sciencedirect.com/science/article/pii/S1051200425001435
Impact factor: 2.9 (JCR Q2, Engineering, Electrical & Electronic)
Citations: 0
Abstract
The pantograph is a critical component of the railway Pantograph-OCS system, so accurately detecting whether it is raised or lowered is essential for efficient and safe operation. However, two principal problems hamper accurate real-time detection. First, the pantograph's slender structure occupies only a few pixels, which makes effective feature extraction difficult. Second, detection accuracy is susceptible to interference from complex outdoor scenes. In this work, we propose a generalized detection model for branch-like objects with thin structures and multi-scale sizes, using train pantograph state detection as a case study. To this end, we propose Branch-YOLO, which is based on YOLOv8 and improves it in two respects: feature extraction and feature fusion. First, we introduce the BrLayer, which consists of BranchConv and EFAC (Extend Receptive Field Attention Convolution). BranchConv adaptively captures features of thin and tortuous local structures, while EFAC expands the effective receptive field. Second, we propose a feature fusion network (BranchNet) that integrates multi-level semantic information and multi-scale features, significantly reducing background interference when detecting slender objects and improving the model's ability to perceive objects of varying scale. In addition, we propose the LckLayer (Lightweight Cross-Kernel Convolution layer) and introduce the FdM (Feature Decomposition module) in BranchNet for lightweight design, reducing computational overhead and improving model efficiency. The proposed Branch-YOLO not only achieves the best performance on the multi-scale pantograph datasets but also attains 42.9% AP on the COCO val2017 dataset, with 5.9M parameters and 16.8 GFLOPs.
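To make the receptive-field claim concrete, the following minimal sketch computes the theoretical receptive field of a stack of convolution layers using the standard recurrence. It does not reproduce EFAC, whose internals the abstract does not describe; dilation is used here only as a familiar stand-in for a mechanism that widens the receptive field, and the function name and layer configurations are hypothetical.

```python
def receptive_field(layers):
    """Return the theoretical receptive field (in input pixels) of a conv stack.

    Each layer is a tuple (kernel_size, stride, dilation).
    Illustrative only: this is the textbook recurrence, not the paper's EFAC.
    """
    rf = 1    # a single input pixel sees itself
    jump = 1  # spacing, in input pixels, between adjacent output positions
    for kernel_size, stride, dilation in layers:
        # A dilated kernel spans (kernel_size - 1) * dilation steps of size `jump`.
        rf += (kernel_size - 1) * dilation * jump
        jump *= stride
    return rf


# Hypothetical configurations: three 3x3 layers, without and with dilation.
plain = [(3, 1, 1), (3, 1, 1), (3, 1, 1)]
dilated = [(3, 1, 1), (3, 1, 2), (3, 1, 4)]

print(receptive_field(plain))    # 7
print(receptive_field(dilated))  # 15
```

With the same number of layers and weights, the dilated stack covers roughly twice the extent along each axis, which is the kind of gain that matters when an object is only a few pixels wide but many pixels long.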
About the journal:
Digital Signal Processing: A Review Journal is one of the oldest and most established journals in the field of signal processing yet it aims to be the most innovative. The Journal invites top quality research articles at the frontiers of research in all aspects of signal processing. Our objective is to provide a platform for the publication of ground-breaking research in signal processing with both academic and industrial appeal.
The journal has a special emphasis on statistical signal processing methodology such as Bayesian signal processing, and encourages articles on emerging applications of signal processing such as:
• big data
• machine learning
• internet of things
• information security
• systems biology and computational biology
• financial time series analysis
• autonomous vehicles
• quantum computing
• neuromorphic engineering
• human-computer interaction and intelligent user interfaces
• environmental signal processing
• geophysical signal processing including seismic signal processing
• chemioinformatics and bioinformatics
• audio, visual and performance arts
• disaster management and prevention
• renewable energy