Preventing Traffic Accidents Through Machine Learning Predictive Models
Tarikwa Tesfa Bedane, Beakal Gizachew Assefa, Sudhir Kumar Mohapatra
2021 International Conference on Information and Communication Technology for Development for Africa (ICT4DA), published 2021-11-22
DOI: 10.1109/ict4da53266.2021.9672249 (https://doi.org/10.1109/ict4da53266.2021.9672249)
Abstract
Road traffic accidents (RTAs) are a serious societal problem: they cause heavy economic and social losses and are responsible for millions of deaths and injuries worldwide every year. In Ethiopia, for instance, the number of deaths due to traffic accidents is rising from year to year. Addis Ababa is among the cities that experience a high number of RTAs, owing to its growing population and increasing number of vehicles. The main objective of this paper is to apply machine learning algorithms to predict accident severity and to identify the major causes of accidents in crowded cities, using Addis Ababa as a case study. The data were collected from Addis Ababa city police departments, and 12,316 accident records were used for the analysis. We applied seven machine learning classification algorithms (Logistic Regression, Naive Bayes, Decision Tree, Support Vector Machine, K-Nearest Neighbor, Random Forest, and AdaBoost) to predict accident severity and compared their performance to select the best model. We applied random undersampling and SMOTE oversampling to handle the class imbalance in the dependent variable, and Principal Component Analysis (PCA) for dimensionality reduction. The experimental results show that Random Forest achieved a 93.76% F1 score on the SMOTE-oversampled data set with about an 18% reduction in feature size. Moreover, light condition, driving experience, driver age band, type of road lane, and type of junction were identified as the major determinants of accidents. According to this study, these factors should be considered in the design of infrastructure, regulations, and policies to reduce accidents.
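
The abstract names SMOTE oversampling and the F1 score but does not describe how either was implemented. As a rough illustration only, the following pure-Python sketch shows the core idea behind both: SMOTE creates synthetic minority-class points by interpolating between a minority sample and one of its nearest minority neighbours, and the F1 score is the harmonic mean of precision and recall. The function names `smote_oversample` and `f1_score`, the parameter values, and the toy data are all hypothetical and are not taken from the paper. The sketch assumes numeric features and computes F1 for a single class, whereas the paper does not state how its features were encoded or which averaging scheme it used.

```python
import math
import random


def smote_oversample(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points from a list of minority-class samples.

    Simplified SMOTE: pick a minority sample at random, pick one of its
    k nearest minority neighbours, and draw a point on the line segment
    between the two.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by Euclidean distance to `base`.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: math.dist(base, minority[j]))
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def f1_score(y_true, y_pred, positive=1):
    """F1 = 2 * precision * recall / (precision + recall) for one class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy, made-up data: six majority-class points and three minority-class points.
majority = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2], [0.2, 0.4], [0.4, 0.1]]
minority = [[2.0, 2.0], [2.2, 1.9], [1.9, 2.3]]

# Balance the classes by adding synthetic minority points.
synthetic = smote_oversample(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + synthetic
print(len(majority), len(balanced_minority))

# F1 on made-up labels: two true positives, one false positive, one false negative.
example_f1 = f1_score([1, 1, 1, 0, 0, 0], [1, 1, 0, 1, 0, 0])
print(round(example_f1, 4))
```

In practice, oversampling should be applied only to the training split so that synthetic points do not leak into the evaluation data. A library implementation would normally replace this sketch in real experiments.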