A hybrid data-mining framework for train rescheduling strategy pattern discovery
Authors: Rui Chen, Xu Ge, Ping Huang, Chao Wen
Journal: Transportation Safety and Environment (JCR Q2, Transportation Science & Technology; impact factor 2.7)
Published: 2023-02-16
DOI: 10.1093/tse/tdad007 (https://doi.org/10.1093/tse/tdad007)
Citations: 0
Abstract
This study presents a hybrid data-mining framework that combines feature selection algorithms with clustering to discover patterns in high-speed railway train rescheduling strategies (RS). The framework has two stages. In the first stage, decision tree, random forest, Gradient Boosting Decision Tree (GBDT), and eXtreme Gradient Boosting (XGBoost) models are used to assess feature importance, and the features with the greatest influence on RS are selected. In the second stage, K-means clustering is applied to the selected features to uncover the interdependences between RS and the influencing features. The method quantifies the relationships between RS and the influencing factors. The results show how the factors influence RS, how likely different RS are under different situations, and which time periods and trains controllers should pay more attention to. The research can help train traffic controllers better understand train operation patterns and provides direction for optimizing rail traffic RS.
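The two-step idea in the abstract (rank features by importance, then cluster on the selected ones) can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' implementation: the data are synthetic, the feature names (`delay`, `headway`, `noise`) and the rule generating the strategy label are made up, and a single-split Gini score stands in for the decision tree, random forest, GBDT, and XGBoost importances used in the paper. The clustering step is a plain Lloyd's-algorithm K-means written with the standard library only.

```python
import random


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for c in labels:
        counts[c] = counts.get(c, 0) + 1
    return 1.0 - sum((m / n) ** 2 for m in counts.values())


def stump_importance(X, y):
    """Step 1 stand-in: best single-split Gini decrease per feature, normalized to sum to 1."""
    n = len(y)
    parent = gini(y)
    scores = []
    for j in range(len(X[0])):
        best = 0.0
        for t in sorted({row[j] for row in X}):
            left = [y[i] for i in range(n) if X[i][j] <= t]
            right = [y[i] for i in range(n) if X[i][j] > t]
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
            best = max(best, parent - weighted)
        scores.append(best)
    total = sum(scores) or 1.0
    return [s / total for s in scores]


def kmeans(points, k, iters=50, seed=0):
    """Step 2: Lloyd's algorithm, run for at most `iters` rounds."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [
            min(range(k), key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: mean of each cluster; an empty cluster keeps its old centroid.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids


# Synthetic data (assumption): three hypothetical features per train.
data_rng = random.Random(42)
X, y = [], []
for _ in range(60):
    delay = data_rng.uniform(0, 30)    # hypothetical delay in minutes
    headway = data_rng.uniform(3, 15)  # hypothetical headway in minutes
    noise = data_rng.uniform(0, 1)     # irrelevant feature
    X.append([delay, headway, noise])
    y.append(1 if delay > 10 else 0)   # made-up rule for the strategy label

# Step 1: rank features and keep the two most important ones.
importance = stump_importance(X, y)
ranked = sorted(range(len(importance)), key=lambda j: -importance[j])
selected = ranked[:2]

# Step 2: cluster on the selected features and summarize strategy shares.
points = [[row[j] for j in selected] for row in X]
labels, centroids = kmeans(points, k=2)
for c in range(2):
    members = [y[i] for i in range(len(y)) if labels[i] == c]
    share = sum(members) / len(members) if members else float("nan")
    print(f"cluster {c}: {len(members)} trains, share of strategy 1 = {share:.2f}")
```

The per-cluster share printed at the end is one simple way to read "how likely a strategy is in a given situation" off the clusters. In practice the features would usually be scaled before K-means, and the importance scores would come from fitted tree ensembles rather than a single split.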