{"title":"Predicting Road Traffic Collisions Using a Two-Layer Ensemble Machine Learning Algorithm","authors":"James Oduor Oyoo, J. Wekesa, Kennedy Ogada","doi":"10.3390/asi7020025","DOIUrl":null,"url":null,"abstract":"Road traffic collisions are among the world’s critical issues, causing many casualties, deaths, and economic losses, with a disproportionate burden falling on developing countries. Existing research has been conducted to analyze this situation using different approaches and techniques at different stretches and intersections. In this paper, we propose a two-layer ensemble machine learning (ML) technique to assess and predict road traffic collisions using data from a driving simulator. The first (base) layer integrates supervised learning techniques, namely k- Nearest Neighbors (k-NN), AdaBoost, Naive Bayes (NB), and Decision Trees (DT). The second layer predicts road collisions by combining the base layer outputs by employing the stacking ensemble method, using logistic regression as a meta-classifier. In addition, the synthetic minority oversampling technique (SMOTE) was performed to handle the data imbalance before training the model. To simplify the model, the particle swarm optimization (PSO) algorithm was used to select the most important features in our dataset. The proposed two-layer ensemble model had the best outcomes with an accuracy of 88%, an F1 score of 83%, and an AUC of 86% as compared with k-NN, DT, NB, and AdaBoost. The proposed two-layer ensemble model can be used in the future for theoretical as well as practical applications, such as road safety management for improving existing conditions of the road network and formulating traffic safety policies based on evidence.","PeriodicalId":36273,"journal":{"name":"Applied System Innovation","volume":null,"pages":null},"PeriodicalIF":3.8000,"publicationDate":"2024-03-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Applied System Innovation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.3390/asi7020025","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
Road traffic collisions are among the world’s critical issues, causing many casualties, deaths, and economic losses, with a disproportionate burden falling on developing countries. Existing research has been conducted to analyze this situation using different approaches and techniques at different stretches and intersections. In this paper, we propose a two-layer ensemble machine learning (ML) technique to assess and predict road traffic collisions using data from a driving simulator. The first (base) layer integrates supervised learning techniques, namely k- Nearest Neighbors (k-NN), AdaBoost, Naive Bayes (NB), and Decision Trees (DT). The second layer predicts road collisions by combining the base layer outputs by employing the stacking ensemble method, using logistic regression as a meta-classifier. In addition, the synthetic minority oversampling technique (SMOTE) was performed to handle the data imbalance before training the model. To simplify the model, the particle swarm optimization (PSO) algorithm was used to select the most important features in our dataset. The proposed two-layer ensemble model had the best outcomes with an accuracy of 88%, an F1 score of 83%, and an AUC of 86% as compared with k-NN, DT, NB, and AdaBoost. The proposed two-layer ensemble model can be used in the future for theoretical as well as practical applications, such as road safety management for improving existing conditions of the road network and formulating traffic safety policies based on evidence.