Classification of Aviation Incident Causes using LGBM with Improved Cross-Validation
Xiaomei Ni, Huawei Wang, Lingzi Chen, Ruiguan Lin
Journal of Systems Engineering and Electronics, published 2024-05-14
DOI: https://doi.org/10.23919/jsee.2024.000035
Citation count: 0
Abstract
Aviation accidents are currently one of the leading causes of significant injuries and deaths worldwide. This motivates researchers to investigate aircraft safety using data analysis approaches based on advanced machine learning algorithms. To assess aviation safety and identify the causes of incidents, a classification model using the light gradient boosting machine (LGBM) has been developed on data from the Aviation Safety Reporting System (ASRS). The model is improved by k-fold cross-validation with a hybrid sampling model (HSCV), which may boost classification performance and maintain data balance. The results show that the LGBM-HSCV model can significantly improve accuracy while alleviating data imbalance. The comparative approach comprises a vertical comparison with other cross-validation (CV) methods and a lateral comparison across different numbers of folds. Beyond this comparison, two further CV approaches based on the improved method are discussed: one with a different order of sampling and folding, and the other with more rounds of CV. According to the assessment indices for the different methods, the proposed LGBM-HSCV model is effective at detecting incident causes. The proposed model for imbalanced data classification may serve as a point of reference for similar data processing tasks, and its accurate identification of civil aviation incident causes can help improve civil aviation safety.
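The general idea of combining k-fold cross-validation with hybrid sampling can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the sampling scheme, so the sketch assumes plain random undersampling of large classes and random oversampling of small classes to the mean class size, applied to the training portion of each fold only. The function names `stratified_folds`, `hybrid_resample`, and `hybrid_sampling_cv` are hypothetical, and the classifier (LGBM in the paper) is left out so that the example runs with the standard library alone.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, rng):
    """Split sample indices into k folds with roughly equal class proportions."""
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)
    folds = [[] for _ in range(k)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        # Deal each class round-robin so every fold gets its share.
        for pos, idx in enumerate(idxs):
            folds[pos % k].append(idx)
    return folds


def hybrid_resample(indices, labels, rng):
    """Assumed hybrid scheme: bring every class to the mean class size.

    Larger classes are undersampled without replacement; smaller classes
    are oversampled by drawing extra copies with replacement.
    """
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    target = round(len(indices) / len(by_class))
    balanced = []
    for y in sorted(by_class):
        idxs = by_class[y]
        if len(idxs) >= target:
            balanced.extend(rng.sample(idxs, target))
        else:
            balanced.extend(idxs + rng.choices(idxs, k=target - len(idxs)))
    return balanced


def hybrid_sampling_cv(labels, k=5, seed=0):
    """Yield (balanced_train_indices, test_indices) for each of the k folds."""
    rng = random.Random(seed)
    folds = stratified_folds(labels, k, rng)
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        # Only the training part is resampled; the test fold keeps the
        # original class distribution.
        yield hybrid_resample(train_idx, labels, rng), test_idx


# Toy imbalanced label vector: 90 samples of class 0, 10 of class 1.
labels = [0] * 90 + [1] * 10
splits = list(hybrid_sampling_cv(labels, k=5))
```

In use, a classifier would be fitted on the samples selected by each balanced training index list and scored on the matching untouched test fold. Resampling after folding is one possible ordering; the abstract notes that a variant with a different sampling and folding order is also examined.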