Cyclist crash severity modeling: A hybrid approach of XGBoost-SHAP and random parameters logit with heterogeneity in means and variances

Impact factor 3.9 | Zone 2 (Engineering and Technology) | JCR Q1 (Ergonomics)
Antonella Scarano, Matin Sadeghi, Filomena Mauriello, Maria Rella Riccardi, Kayvan Aghabayk, Alfonso Montella
Journal of Safety Research, volume 93, pages 373-398. Published 2025-05-05. DOI: 10.1016/j.jsr.2025.04.003
https://www.sciencedirect.com/science/article/pii/S0022437525000611

Abstract

Introduction: Across the globe, policymakers are promoting sustainable transport options, notably cycling, to foster eco-friendly urban environments. However, the persistent safety challenges that cyclists face continue to hinder these efforts. Method: This research explores a novel hybrid methodology for investigating the determinants of cyclist crash severity, combining eXtreme Gradient Boosting (XGBoost) with SHapley Additive exPlanations (SHAP) and a random parameters logit model with heterogeneity in means and variances (RPLHMV). The methodology’s effectiveness is evaluated using Department for Transport data on crashes in Great Britain from 2016 to 2019. The XGBoost-SHAP model reduced data dimensionality, allowing the application of a robust statistical model, while the RPLHMV captured heterogeneity in both the means and the variances of the random parameters. Results: The statistical model identified 10 significant variables with fixed parameters for fatal crashes, 22 significant variables for serious injuries, and two indicator variables (cyclist age ≤ 17 and overtaking as the manoeuvre of the second vehicle) with statistically significant random parameters associated with serious injury outcomes. The relationships revealed by the logit framework were further examined using XGBoost-SHAP, which provided deeper insights into the interactions between random and fixed parameters. The hybrid approach achieved a very good McFadden R² of 0.52 for the RPLHMV, demonstrating the model’s robustness. Conclusions: The hybrid approach not only provides a deeper understanding of crash severity dynamics but also helps in creating specific safety measures. Practical applications: This research can guide policymakers in identifying key factors and interactions that affect crash severity, leading to targeted safety improvements.
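
To make the two quantitative ideas in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the formulation commonly used in the crash-severity literature, in which a random parameter is written as beta_i = b + theta*z_i + sigma*exp(omega*w_i)*nu_i with nu_i drawn from a standard normal distribution, so that z_i shifts the mean and w_i scales the variance. It also assumes that McFadden's R² is computed as 1 - LL(model)/LL(baseline); the abstract does not state which baseline (parameters at zero, or constants only) was used. All function names and numeric values below are hypothetical and chosen only for illustration.

```python
import math
import random


def random_parameter(b, theta, z, sigma, omega, w, nu):
    """One draw of a random parameter with heterogeneity in mean and variance.

    Assumed form: beta = b + theta*z + sigma*exp(omega*w)*nu,
    where z shifts the mean, w scales the standard deviation, nu ~ N(0, 1).
    """
    return b + theta * z + sigma * math.exp(omega * w) * nu


def simulated_probabilities(utilities_fn, n_draws=500, seed=0):
    """Average multinomial-logit probabilities over standard-normal draws."""
    rng = random.Random(seed)
    totals = None
    for _ in range(n_draws):
        v = utilities_fn(rng.gauss(0.0, 1.0))
        m = max(v)  # subtract the maximum for numerical stability
        exps = [math.exp(x - m) for x in v]
        s = sum(exps)
        probs = [e / s for e in exps]
        totals = probs if totals is None else [t + p for t, p in zip(totals, probs)]
    return [t / n_draws for t in totals]


def mcfadden_r2(ll_model, ll_null):
    """McFadden pseudo-R2: 1 - LL(model) / LL(baseline)."""
    return 1.0 - ll_model / ll_null


def example_utilities(nu):
    # Hypothetical coefficient of the "cyclist age <= 17" indicator in the
    # serious-injury function; every number here is made up.
    beta_young = random_parameter(b=-0.4, theta=0.3, z=1.0,
                                  sigma=0.8, omega=0.2, w=1.0, nu=nu)
    v_slight = 0.0                       # base outcome
    v_serious = -1.0 + beta_young * 1.0  # indicator equals 1 for this cyclist
    v_fatal = -3.0
    return [v_slight, v_serious, v_fatal]


# Simulated outcome probabilities in the order [slight, serious, fatal].
probs = simulated_probabilities(example_utilities)

# Hypothetical log-likelihoods chosen so the ratio matches the reported 0.52.
r2 = mcfadden_r2(ll_model=-480.0, ll_null=-1000.0)
```

In this sketch the simulation averages over draws of nu, which is how random-parameter models are typically evaluated by simulated likelihood; a real estimation would use many more draws (often Halton sequences) and would maximise the simulated log-likelihood over b, theta, sigma and omega.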
Source journal
CiteScore: 6.40
Self-citation rate: 4.90%
Articles published: 174
Review time: 61 days
About the journal: Journal of Safety Research is an interdisciplinary publication that provides for the exchange of ideas and scientific evidence capturing studies through research in all areas of safety and health, including traffic, workplace, home, and community. This forum invites research using rigorous methodologies, encourages translational research, and engages the global scientific community through various partnerships (e.g., this outreach includes highlighting some of the latest findings from the U.S. Centers for Disease Control and Prevention).