Understanding the effects of underreporting on injury severity estimation of single-vehicle motorcycle crashes: A hybrid approach incorporating majority class oversampling and random parameters with heterogeneity-in-means

Impact factor: 12.5 | CAS Zone 1 (Engineering & Technology) | JCR Q1 (Public, Environmental & Occupational Health)
Nawaf Alnawmasi, Apostolos Ziakopoulos, Athanasios Theofilatos, Yasir Ali
Analytic Methods in Accident Research, Volume 45, Article 100372. Published 2025-01-23. DOI: 10.1016/j.amar.2025.100372. URL: https://www.sciencedirect.com/science/article/pii/S221366572500003X
Citations: 0

Abstract

The underreporting of crash data is a well-documented issue in the road safety literature, but few studies have addressed it in the context of analyzing crash injury severities. This paper provides an empirical assessment of the impact of underreporting, using a hybrid approach to estimate injury severity for single-vehicle motorcycle crashes. Unlike traditional machine learning methods that oversample the minority class (the category with fewer observations, such as fatal and severe injuries), the present study oversamples the majority class (i.e., minor injuries), which is often underreported in crash datasets, thus providing a fresh perspective on this issue. Random parameters models with heterogeneity in means and variances were then estimated. The results, supported by likelihood ratio tests, indicate that the key variables influencing motorcyclists' injury severities remain consistent across the original and oversampled data models. Specifically, crashes occurring while slowing down or stopping are associated with lower injury severity, whereas negotiating a right turn increases the probability of severe injuries. Interestingly, crashes on dry pavements are associated with higher injury severity than those on wet pavements, likely because riders adjust their behavior in adverse weather conditions to compensate for the risk. Overall, the oversampled models yield significantly lower marginal effect values than the original model. This study provides a foundation for further examination of underreporting in crash injury severity modelling and highlights the need to capture the dynamics of crash injuries, suggesting that alternative approaches could improve understanding and hence road safety management. Future studies are encouraged to replicate this methodology to validate the findings and to use other advanced machine learning algorithms, such as tree-based models, to assess underreporting mitigation.
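
The majority class oversampling step described in the abstract can be illustrated with a minimal sketch. The abstract does not state the sampling procedure or the oversampling ratio the authors used, so the code below assumes plain random oversampling with replacement; the function name `oversample_majority`, the `factor` parameter, and the toy records are all hypothetical and exist only for illustration.

```python
import random
from collections import Counter


def oversample_majority(records, label_key, majority_label, factor, seed=0):
    """Return the records plus extra majority-class rows drawn with replacement.

    `factor` is an ASSUMED ratio of true to reported majority-class crashes
    (1.5 means 50% more minor-injury rows are added). It is not taken from
    the paper.
    """
    if factor < 1.0:
        raise ValueError("factor must be >= 1.0")
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    majority = [r for r in records if r[label_key] == majority_label]
    n_extra = round(len(majority) * (factor - 1.0))
    # Copy each sampled row so later edits do not alias the original record.
    extra = [dict(rng.choice(majority)) for _ in range(n_extra)] if majority else []
    return list(records) + extra


# Hypothetical toy data: minor injuries are the majority class.
crashes = (
    [{"severity": "minor", "dry_pavement": i % 2} for i in range(60)]
    + [{"severity": "severe", "dry_pavement": 1} for _ in range(30)]
    + [{"severity": "fatal", "dry_pavement": 1} for _ in range(10)]
)

augmented = oversample_majority(crashes, "severity", "minor", factor=1.5)
counts = Counter(r["severity"] for r in augmented)
print(counts)
```

Only the minor-injury rows are augmented; the severe and fatal counts are left unchanged, which is the reverse of the usual minority class oversampling. A severity model would then be estimated on both `crashes` and `augmented` and the two sets of estimates compared.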
Source journal
CiteScore: 22.10
Self-citation rate: 34.10%
Articles published: 35
Review time: 24 days
About the journal: Analytic Methods in Accident Research is a journal that publishes articles related to the development and application of advanced statistical and econometric methods in studying vehicle crashes and other accidents. The journal aims to demonstrate how these innovative approaches can provide new insights into the factors influencing the occurrence and severity of accidents, thereby offering guidance for implementing appropriate preventive measures. While the journal primarily focuses on the analytic approach, it also accepts articles covering various aspects of transportation safety (such as road, pedestrian, air, rail, and water safety), construction safety, and other areas where human behavior, machine failures, or system failures lead to property damage or bodily harm.