Estimating the occurrence of broken rails in commuter railroads with machine learning algorithms

IF 1.7 | Zone 4, Engineering and Technology | Q3 ENGINEERING, CIVIL
Di Kang, Junyan Dai, Xiang Liu, Zheyong Bian, Asim Zaman, Xin Wang
Journal: Proceedings of the Institution of Mechanical Engineers Part F-Journal of Rail and Rapid Transit
DOI: 10.1177/09544097241280848 (https://doi.org/10.1177/09544097241280848)
Published: 2024-09-04
Publication type: Journal Article
Citations: 0

Abstract

Preventing broken rails is critical to track infrastructure safety, and the growing availability of rail data makes data-driven analysis a promising way to improve it. Previous research has concentrated on predicting broken rails on freight railroads; commuter railroads have received limited attention. To address this gap, this paper presents an analytical modeling framework based on machine learning (ML) algorithms (including LightGBM, XGBoost, Random Forests, and Logistic Regression) to investigate the occurrence of broken rails on commuter rail segments. The framework uses features such as gradient, curvature, annual traffic, operational speed, and the history of prior rail defects. Because most commuter rail segments experienced no broken rails during the study period, broken rail instances form a small sample and the data are imbalanced; we address this with oversampling techniques, including ADASYN, random oversampling, and SMOTE. For the dataset used in this study, LightGBM combined with random oversampling performs best. The feature importance results indicate that the critical factors for predicting broken rail occurrences on this commuter railroad are gradient, operational speed, and prior rail defects.
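The random oversampling step named in the abstract can be sketched in a few lines. This is a minimal illustration using only the Python standard library, not the authors' implementation: the function name `random_oversample`, the toy segment rows, and the feature units are all hypothetical. The idea is to duplicate randomly chosen minority-class rows (segments with a broken rail) until each class has as many rows as the majority class.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Return copies of rows/labels with minority classes padded by random
    duplication until every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the majority class

    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        # All original rows belonging to this class.
        pool = [r for r, y in zip(rows, labels) if y == cls]
        # Draw with replacement until the class reaches the target size.
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data, one row per rail segment:
# (gradient, curvature, annual traffic, operational speed, prior defects)
segments = [
    (0.2, 1.0, 12.0, 60, 0),
    (0.5, 0.0, 15.0, 70, 1),
    (0.1, 2.5, 9.0, 45, 0),
    (0.8, 0.5, 20.0, 80, 0),
    (0.3, 1.5, 11.0, 55, 2),
    (0.0, 0.0, 14.0, 65, 0),
    (0.6, 3.0, 8.0, 40, 1),
    (1.2, 4.0, 18.0, 75, 3),
]
# 1 = segment had a broken rail during the study period, 0 = it did not.
broken = [0, 0, 0, 0, 0, 0, 0, 1]

X_bal, y_bal = random_oversample(segments, broken, seed=42)
print(Counter(broken))  # class counts before balancing
print(Counter(y_bal))   # class counts after balancing
```

The balanced rows would then be passed to a classifier such as LightGBM. Oversampling is usually applied to the training split only, so that duplicated rows do not appear in the evaluation data; whether the paper follows that exact protocol is not stated in the abstract.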
Source journal
CiteScore: 4.80
Self-citation rate: 10.00%
Articles published: 91
Review time: 7 months
About the journal: The Journal of Rail and Rapid Transit is devoted to engineering in its widest interpretation applicable to rail and rapid transit. The Journal aims to promote sharing of technical knowledge, ideas and experience between engineers and researchers working in the railway field.