Li Yue, Luyue Liu, Maoqing Li, Baodi Xiao, Xiaochun Wu
Transportation Safety and Environment. Journal article, published 2022-12-21. DOI: 10.1093/tse/tdac066 (https://doi.org/10.1093/tse/tdac066)
Research on Text Fault Recognition for On-board Equipment of C3 Train Control System Based on Integrated XGBoost Algorithm
The reliability of train control on-board equipment is inextricably linked to the safe operation of high-speed trains. A fault diagnosis model for on-board equipment is built using the ensemble learning algorithm XGBoost (eXtreme Gradient Boosting) to help technicians determine the fault category of high-speed train control on-board equipment accurately and rapidly. The XGBoost algorithm iterates over multiple decision tree models, improving fault diagnosis accuracy by boosting on the prediction residuals and adding regularization terms. First, text features were extracted using an improved TF-IDF (Term Frequency–Inverse Document Frequency) approach, and 24 fault feature words were selected and converted into weighted word vectors. Second, because the fault categories in the data set are imbalanced, the ADASYN (Adaptive Synthetic Sampling) oversampling technique was used to synthesize samples for the minority fault categories. Finally, the fault text data of CTCS-3 train control on-board equipment recorded by Guangzhou Railway Group maintenance personnel were split into training and test sets. After parameter tuning through grid search, the XGBoost model was used to automatically locate faults in the test set. Compared with other methods, the evaluation indices of the XGBoost model improved significantly. The diagnostic accuracy reached 95.43%, which verifies the effectiveness of the method for text-based fault diagnosis.
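The feature-extraction step can be illustrated with a minimal sketch. It uses plain TF-IDF with a smoothed IDF, not the improved variant the abstract refers to, whose details are not given here. The function name `tfidf_vectors`, the toy fault records, and the choice of ranking words by their summed weight are all assumptions made for illustration; the paper selects 24 feature words, while this toy example keeps 4.

```python
import math
from collections import Counter


def tfidf_vectors(docs, k):
    """Return (feature_words, vectors) for a list of tokenized records.

    Plain TF-IDF sketch (assumption: NOT the paper's improved variant).
    """
    n_docs = len(docs)
    # Document frequency: number of records that contain each word.
    df = Counter(word for doc in docs for word in set(doc))
    # Smoothed inverse document frequency (one common variant).
    idf = {w: math.log((1 + n_docs) / (1 + c)) + 1.0 for w, c in df.items()}

    # TF-IDF weight of every word in every record.
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append({w: (cnt / len(doc)) * idf[w] for w, cnt in tf.items()})

    # Rank words by summed weight over the corpus (hypothetical selection
    # rule) and keep the top k as feature words; ties break alphabetically.
    totals = {}
    for record in weights:
        for w, value in record.items():
            totals[w] = totals.get(w, 0.0) + value
    feature_words = sorted(totals, key=lambda w: (-totals[w], w))[:k]

    # Each record becomes a fixed-length vector of feature-word weights.
    vectors = [[record.get(w, 0.0) for w in feature_words] for record in weights]
    return feature_words, vectors


# Invented, already-tokenized fault records (not from the paper's data set).
docs = [
    ["BTM", "antenna", "fault"],
    ["BTM", "port", "invalid"],
    ["radio", "connection", "timeout"],
    ["radio", "timeout", "timeout"],
]
feature_words, vectors = tfidf_vectors(docs, k=4)
for words, vec in zip(docs, vectors):
    print(words, [round(v, 3) for v in vec])
```

In a pipeline like the one described, vectors of this kind would then be rebalanced (for example with ADASYN) and passed to the classifier; those steps are omitted here.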