Title: A fast and lightweight train image fault detection model based on convolutional neural networks
Authors: Longxin Zhang, Wenliang Zeng, Peng Zhou, Xiaojun Deng, Jiayu Wu, Hong Wen
DOI: 10.1016/j.imavis.2024.105380
Journal: Image and Vision Computing, Volume 154, Article 105380 (Journal Article)
Publication date: 2025-02-01
Impact factor: 4.2 (JCR Q2, Computer Science, Artificial Intelligence)
URL: https://www.sciencedirect.com/science/article/pii/S0262885624004852
Citations: 0
Abstract
Trains play a vital role in residents' daily lives, and fault detection is essential to ensuring their safe operation. To address the large parameter counts, slow detection speed, and low detection accuracy of current train image fault detection models, this study proposes a fast and lightweight train image fault detection model based on convolutional neural networks (FL-TINet). First, a joint depthwise separable convolution and divided-channel convolution strategy is applied to the feature extraction network in FL-TINet to reduce the number of parameters and the amount of computation in the backbone network, thereby increasing detection speed. Second, a mixed attention mechanism is designed to make FL-TINet focus on key features. Finally, an improved discrete K-means clustering algorithm is designed to set the anchor boxes so that they cover objects better, thereby improving detection accuracy. Experimental results on the PASCAL 2012 and train datasets show that FL-TINet detects faults at 119 frames per second. Compared with the state-of-the-art CenterNet, RetinaNet, SSD, Faster R-CNN, MobileNet, YOLOv3, YOLOv4, YOLOv7-Tiny, YOLOv8_n and YOLOX-Tiny models, FL-TINet's detection speed is 96.37% higher on average, and it achieves higher detection accuracy with fewer parameters. A robustness test shows that FL-TINet withstands noise and illumination changes well.
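The abstract does not give FL-TINet's exact layer configuration, but the parameter savings it attributes to depthwise separable convolution can be illustrated with a minimal count comparison. The kernel size and channel widths below are assumed purely for illustration, not taken from the paper:

```python
def conv_params(k, c_in, c_out):
    # Standard convolution: every one of the c_out filters
    # spans all c_in input channels with a k x k kernel.
    return k * k * c_in * c_out

def dw_separable_params(k, c_in, c_out):
    # Depthwise separable: one k x k filter per input channel (depthwise),
    # followed by a 1x1 pointwise convolution that mixes channels.
    return k * k * c_in + c_in * c_out

std = conv_params(3, 64, 128)          # 3*3*64*128 = 73728
sep = dw_separable_params(3, 64, 128)  # 3*3*64 + 64*128 = 8768
print(std, sep, round(std / sep, 1))   # roughly an 8.4x reduction
```

The ratio grows with the number of output channels, which is why the substitution pays off most in the deeper, wider stages of a backbone.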
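The paper's "improved discrete K-means" variant is not detailed in the abstract. As background, the sketch below shows the standard IoU-distance K-means anchor clustering used in YOLO-family detectors, which such improvements typically build on; the function names and the toy box sizes are illustrative assumptions:

```python
import random

def iou_wh(box, anchor):
    # IoU of two (width, height) boxes aligned at the origin,
    # the usual shape-only distance for anchor clustering.
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union

def kmeans_anchors(boxes, k, iters=100, seed=0):
    random.seed(seed)
    anchors = random.sample(boxes, k)
    for _ in range(iters):
        # Assign each box to the anchor with highest IoU (distance = 1 - IoU).
        clusters = [[] for _ in range(k)]
        for b in boxes:
            best = max(range(k), key=lambda i: iou_wh(b, anchors[i]))
            clusters[best].append(b)
        # Move each anchor to its cluster's mean width/height.
        new = [
            (sum(b[0] for b in c) / len(c), sum(b[1] for b in c) / len(c))
            if c else anchors[i]
            for i, c in enumerate(clusters)
        ]
        if new == anchors:  # converged
            break
        anchors = new
    return sorted(anchors)

# Toy ground-truth boxes: two small, two large.
boxes = [(10, 12), (11, 13), (50, 60), (48, 55)]
print(kmeans_anchors(boxes, 2))  # one small anchor, one large anchor
```

Using 1 − IoU instead of Euclidean distance keeps large boxes from dominating the clusters, so the resulting anchors cover each object scale more evenly.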
About the journal:
Image and Vision Computing has as a primary aim the provision of an effective medium of interchange for the results of high quality theoretical and applied research fundamental to all aspects of image interpretation and computer vision. The journal publishes work that proposes new image interpretation and computer vision methodology or addresses the application of such methods to real world scenes. It seeks to strengthen a deeper understanding in the discipline by encouraging the quantitative comparison and performance evaluation of the proposed methodology. The coverage includes: image interpretation, scene modelling, object recognition and tracking, shape analysis, monitoring and surveillance, active vision and robotic systems, SLAM, biologically-inspired computer vision, motion analysis, stereo vision, document image understanding, character and handwritten text recognition, face and gesture recognition, biometrics, vision-based human-computer interaction, human activity and behavior understanding, data fusion from multiple sensor inputs, image databases.