Analysis of freight train passing a stop signal using machine learning: Application of XGBoost and SHAP
Mona Ahmadi Rad, Lianne M. Lefsrud, Michael T. Hendry, Asdrubal Cheng Cen, Sara Soltaninejad
Journal of Rail Transport Planning & Management, vol. 35, Article 100532. Published 2025-06-24. DOI: 10.1016/j.jrtpm.2025.100532
https://www.sciencedirect.com/science/article/pii/S2210970625000290
Abstract
Passing a Stop Signal (PASS) is a critical safety concern in railway operations, with the potential to cause serious accidents. This study investigates non-human contributing factors to PASS events in Canadian mainline freight operations using machine learning. We analyze incident narratives from the Rail Occurrence Database System (RODS) through text mining and enrich them with geospatial and weather data. We develop a binary classification model using XGBoost and interpret feature importance and interactions with SHAP (SHapley Additive exPlanations). To address class imbalance and improve model performance, we apply a custom sampling method, combined with hyperparameter tuning and data standardization. Key contributors to PASS events include sharp track curvature near signals, downhill grades, low atmospheric pressure, high relative humidity, non-clear weather, and heavy traffic—placing Rocky Mountain subdivisions among the highest-risk areas. The model also reveals that combinations of environmental conditions, such as low temperature, low pressure, and high humidity, increase the likelihood of PASS events by reducing visibility and braking effectiveness. This study offers methodological and empirical contributions by modelling complex operational contexts, incorporating underexplored environmental factors, and producing region-specific insights. The proposed framework informs proactive safety strategies and supports risk analysis in other linear infrastructure systems.
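To make the attribution idea behind SHAP concrete, the following is a minimal sketch of exact Shapley values computed by brute force over feature coalitions. It is not the authors' pipeline: the scoring function `toy_risk`, its coefficients, and the four binary features are hypothetical and chosen only to show how an interaction (here, low pressure combined with high humidity) is split between the features involved. In practice, the `shap` library computes these attributions efficiently for tree models such as XGBoost; the standard-library version below assumes nothing beyond the Shapley definition itself.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance `x` relative to `baseline`.

    Features outside a coalition are replaced by their baseline value.
    Cost grows as 2^n, so this is only practical for a handful of features.
    """
    n = len(x)
    features = range(n)

    def value(coalition):
        # Evaluate the model with only the coalition's features "switched on".
        z = [x[i] if i in coalition else baseline[i] for i in features]
        return predict(z)

    phi = [0.0] * n
    for i in features:
        others = [j for j in features if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size: |S|! (n-|S|-1)! / n!
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_risk(z):
    """Hypothetical risk score (illustrative only, not the paper's model)."""
    curvature, downhill, low_pressure, high_humidity = z
    # Two additive effects plus one interaction between weather features.
    return 0.10 * curvature + 0.05 * downhill + 0.20 * low_pressure * high_humidity


x = [1, 1, 1, 1]          # all hypothetical risk conditions present
baseline = [0, 0, 0, 0]   # none present
phi = shapley_values(toy_risk, x, baseline)

for name, contribution in zip(
    ["curvature", "downhill", "low_pressure", "high_humidity"], phi
):
    print(f"{name}: {contribution:.3f}")
```

The attributions sum to the difference between the score at `x` and the score at `baseline`, which is the additivity property SHAP relies on. The interaction term is shared equally between the two weather features, which is why jointly occurring conditions can each receive a larger attribution than either would alone.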