Pre-crash injury risk prediction with guaranteed confidence level: a conformal and interpretable framework.

IF 1.9, Zone 3 (Engineering & Technology), Q3, PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH

Traffic Injury Prevention, Pub Date: 2025-08-19, DOI: 10.1080/15389588.2025.2538725

Junhao Wei, Yusuke Miyazaki, Fusako Sato


Citations: 0

Abstract

Objective: Pre-crash injury risk prediction is crucial for proactive safety measures. Traditional models, however, output single-point predictions without explaining the reasons behind them, so they often lack both interpretability and the reliable uncertainty estimates needed to reflect potential risk distributions. These drawbacks limit their practical effectiveness in mitigating injury severity. To overcome these limitations, this study develops a novel framework that outputs potential risk distributions and their corresponding probabilities using only pre-crash data, delivering probabilistic outputs with a statistically guaranteed 90% confidence level. With this framework, we aim to provide a more convincing and interpretable analysis of the injury distribution and its underlying causes in traffic accidents, ultimately offering data-driven guidance for injury mitigation strategies.
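For orientation, the kind of guarantee that conformal prediction usually provides can be written as below. This is the textbook split-conformal statement, which assumes that calibration and test cases are exchangeable; the abstract does not say which variant the authors rely on, and the symbols ($\hat{C}$ for the prediction set, $\alpha$ for the error level, $n$ for the calibration size, $k$ for an injury class) are notation introduced here for illustration.

```latex
% Marginal coverage (the usual guarantee of "naive" split conformal prediction)
\[
  \Pr\bigl(Y_{n+1} \in \hat{C}(X_{n+1})\bigr) \;\ge\; 1 - \alpha ,
  \qquad \alpha = 0.10 .
\]
% Class-conditional coverage (the guarantee holds within each injury class k)
\[
  \Pr\bigl(Y_{n+1} \in \hat{C}(X_{n+1}) \,\big|\, Y_{n+1} = k\bigr) \;\ge\; 1 - \alpha
  \qquad \text{for every class } k .
\]
```

Read this way, a "90% confidence level" is a statement about how often the prediction set contains the true injury class, not about the accuracy of any single point prediction.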

Methods: Data were drawn from the National Automotive Sampling System-Crashworthiness Data System (NASS-CDS) and the Crash Investigation Sampling System (CISS), covering 28 pre-crash risk factors. Several machine learning models, including ensemble methods and the deep learning model TabNet, were evaluated. To address the significant class imbalance, particularly the limited number of serious injury cases, various resampling strategies were applied. The core contribution lies in integrating conformal prediction methods, both naive and class-conditional, to generate prediction sets at a 90% confidence level. Model performance was assessed via global evaluation metrics (i.e., F1-score) and serious-injury recall, and interpretability was enhanced using explainable machine learning and statistical analysis.
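To make the difference between naive and class-conditional conformal prediction concrete, here is a minimal sketch of split conformal classification in plain Python. It is not the authors' implementation: the nonconformity score (one minus the probability assigned to a class), the two-class setup, and the `simulate` helper that stands in for a fitted classifier are all assumptions made for illustration.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Finite-sample-corrected (1 - alpha) quantile of calibration scores."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))  # rank of the order statistic to use
    if n == 0 or k > n:
        return float("inf")  # too little calibration data: keep every class
    return sorted(scores)[k - 1]


def calibrate(cal_probs, cal_labels, alpha=0.10, class_conditional=False):
    """Return one threshold per class on the score s = 1 - p(class)."""
    n_classes = len(cal_probs[0])
    scores = [1.0 - p[y] for p, y in zip(cal_probs, cal_labels)]
    if not class_conditional:
        # Naive variant: a single threshold shared by all classes.
        return [conformal_quantile(scores, alpha)] * n_classes
    # Class-conditional variant: calibrate each class on its own cases only.
    return [
        conformal_quantile([s for s, y in zip(scores, cal_labels) if y == c], alpha)
        for c in range(n_classes)
    ]


def prediction_set(probs, thresholds):
    """Keep every class whose score 1 - p(class) is within its threshold."""
    return [c for c, p in enumerate(probs) if 1.0 - p <= thresholds[c]]


def class_coverage(sets, labels, c):
    """Fraction of class-c cases whose prediction set contains class c."""
    hits = [y in s for s, y in zip(sets, labels) if y == c]
    return sum(hits) / len(hits)


def simulate(n, seed, serious_rate=0.15):
    """Hypothetical stand-in for a fitted classifier's held-out probabilities."""
    rng = random.Random(seed)
    probs, labels = [], []
    for _ in range(n):
        y = 1 if rng.random() < serious_rate else 0  # 1 = serious injury
        z = rng.gauss(0.5 if y == 1 else -1.5, 1.0)  # noisy logit for "serious"
        p = 1.0 / (1.0 + math.exp(-z))
        probs.append([1.0 - p, p])
        labels.append(y)
    return probs, labels


if __name__ == "__main__":
    cal_p, cal_y = simulate(5000, seed=0)
    test_p, test_y = simulate(5000, seed=1)
    for cc in (False, True):
        th = calibrate(cal_p, cal_y, alpha=0.10, class_conditional=cc)
        sets = [prediction_set(p, th) for p in test_p]
        name = "class-conditional" if cc else "naive"
        for c in (0, 1):
            print(name, "class", c, "coverage", round(class_coverage(sets, test_y, c), 3))
```

In a sketch like this, the naive variant targets 90% coverage only on average over all cases, so a rare class such as serious injury can end up covered less often, whereas the class-conditional variant calibrates a separate threshold per class. Calibration cases are kept separate from the data used to fit the classifier.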

Results: Comparative experiments show that the proposed framework achieves nearly 90% prediction coverage and a 70.3% recall rate for serious injuries, values that are significantly higher than those reported in related studies. Further model interpretation highlights key risk factors, such as intersection relevance, crash type, and speed limits, and shows how they affect injury severity prediction.
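The metrics named here can be computed as in the short sketch below. The abstract does not define how recall is measured for set-valued predictions, so this example assumes that coverage is evaluated on prediction sets while recall and F1 are evaluated on point predictions; the labels, sets, and predictions are made-up toy values, not data from the study.

```python
def coverage(pred_sets, labels):
    """Share of cases whose prediction set contains the true class."""
    return sum(y in s for s, y in zip(pred_sets, labels)) / len(labels)


def mean_set_size(pred_sets):
    """Average number of classes per prediction set (smaller is more informative)."""
    return sum(len(s) for s in pred_sets) / len(pred_sets)


def recall_and_f1(preds, labels, positive=1):
    """Recall and F1 for one class, computed from point predictions."""
    tp = sum(p == positive and y == positive for p, y in zip(preds, labels))
    fp = sum(p == positive and y != positive for p, y in zip(preds, labels))
    fn = sum(p != positive and y == positive for p, y in zip(preds, labels))
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
    return recall, f1


# Toy data: 1 = serious injury, 0 = non-serious (illustrative values only).
labels = [0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
pred_sets = [[0], [0], [0, 1], [1], [0, 1], [0], [0], [0], [1], [1]]
preds = [0, 0, 1, 1, 1, 0, 0, 0, 1, 1]

print("coverage:", coverage(pred_sets, labels))        # 8 of 10 sets contain the truth
print("mean set size:", mean_set_size(pred_sets))      # 12 classes over 10 sets
print("serious recall, F1:", recall_and_f1(preds, labels, positive=1))
```

Reporting the mean set size alongside coverage is a common sanity check, since a predictor that always returns every class reaches full coverage while being uninformative.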

Conclusions: The proposed framework demonstrates significant potential in pre-crash injury risk prediction by introducing conformal prediction techniques to machine learning models. In addition to achieving nearly 90% prediction coverage and a 70.3% recall rate for serious injuries, the framework improves interpretability by quantifying prediction uncertainty and identifying key risk factors. Unlike traditional methods, the framework remains valid under distribution shifts and combines uncertainty estimation with model interpretability. These advantages collectively lay a foundation for developing proactive traffic safety applications and formulating data-driven road safety policies.

Source journal

Traffic Injury Prevention (PUBLIC, ENVIRONMENTAL & OCCUPATIONAL HEALTH)

CiteScore: 3.60
Self-citation rate: 10.00%
Articles published: 137
Review time: 3 months

About the journal: The purpose of Traffic Injury Prevention is to bridge the disciplines of medicine, engineering, public health and traffic safety in order to foster the science of traffic injury prevention. The archival journal focuses on research, interventions and evaluations within the areas of traffic safety, crash causation, injury prevention and treatment. General topics within the journal's scope are driver behavior, road infrastructure, emerging crash avoidance technologies, crash and injury epidemiology, alcohol and drugs, impact injury biomechanics, vehicle crashworthiness, occupant restraints, pedestrian safety, evaluation of interventions, economic consequences, and emergency and clinical care with specific application to traffic injury prevention. The journal includes full-length papers, review articles, case studies, brief technical notes and commentaries.