Road Crashes Analysis and Prediction using Gradient Boosted and Random Forest Trees

S. Elyassami, Yasir Hamid, T. Habuza
Published in: 2020 6th IEEE Congress on Information Science and Technology (CiSt), 2020-06-05
DOI: 10.1109/CiSt49399.2021.9357298 (https://doi.org/10.1109/CiSt49399.2021.9357298)
Citations: 6

Abstract

People lose their lives every day in road traffic crashes. The problem is so large globally that the World Health Organization, in its Sustainable Development Agenda 2030, is calling for coordinated efforts across nations and aims to halve the resulting deaths and injuries. Taking a cue from that, this work builds machine learning-based models to analyze crash data, identify the important risk factors, and predict the injury severity of drivers. We studied and analyzed several factors of road accidents, investigating both crash causal factors and crash severity factors, to create an accurate and interpretable model that predicts the occurrence and severity of car accidents. We applied three machine learning algorithms, namely Decision Tree, Random Forest, and Gradient Boosted Trees, to the Statewide Vehicle Crashes dataset provided by the Maryland State Police. The gradient boosted model achieved the highest prediction accuracy and identified the most influential factors in the predictive model. The findings showed that disregarding traffic signals and stop signs, road design problems, poor visibility, and bad weather conditions are the most important variables in the predictive road traffic crash model. Using the identified risk factors is crucial for establishing actions that may reduce the risks related to those factors.
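
The comparison described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn and synthetic data. It is not the authors' pipeline: the data is generated by `make_classification` rather than taken from the Maryland crash records, the target is binary rather than an injury severity scale, and the feature names and the `compare_models` helper are hypothetical placeholders. The sketch fits the three tree-based classifiers on one train/test split, reports test accuracy, and ranks features by each model's impurity-based importance.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Hypothetical feature names standing in for crash attributes; the abstract
# does not list the dataset's actual columns.
FEATURES = [
    "signal_violation", "road_design", "visibility",
    "weather", "hour_of_day", "speed_limit",
]


def compare_models(seed=0):
    """Fit three tree-based classifiers on synthetic data and summarize them."""
    # Synthetic stand-in data (assumption), not the Maryland crash records.
    X, y = make_classification(
        n_samples=600,
        n_features=len(FEATURES),
        n_informative=4,
        n_redundant=0,
        random_state=seed,
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=seed, stratify=y
    )

    models = {
        "decision_tree": DecisionTreeClassifier(max_depth=5, random_state=seed),
        "random_forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "gradient_boosting": GradientBoostingClassifier(random_state=seed),
    }

    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        accuracy = accuracy_score(y_test, model.predict(X_test))
        # Impurity-based importances, sorted from most to least influential.
        ranked = sorted(
            zip(FEATURES, model.feature_importances_),
            key=lambda pair: pair[1],
            reverse=True,
        )
        results[name] = {"accuracy": accuracy, "importances": ranked}
    return results


if __name__ == "__main__":
    for name, summary in compare_models().items():
        top = ", ".join(f"{feat} ({score:.2f})" for feat, score in summary["importances"][:3])
        print(f"{name}: accuracy={summary['accuracy']:.3f}; top features: {top}")
```

On real crash data, the categorical attributes would need encoding first, and a multi-class severity target would call for per-class metrics alongside overall accuracy.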