IF 3.9 | Zone 2, Engineering Technology | JCR Q1, Ergonomics
Nuri Park, Juneyoung Park, Chris Lee
Journal of Safety Research, Volume 92, Pages 217-229. Published 2025-02-01. Journal Article.
DOI: 10.1016/j.jsr.2024.12.001
URL: https://www.sciencedirect.com/science/article/pii/S0022437524002093
Citations: 0

Conditional Generative Adversarial Network-Based roadway crash risk prediction considering heterogeneity with dynamic data
Introduction: Roadway crashes are rare and occur randomly, which poses several challenges for developing a crash prediction model for real-time traffic safety management. To address the small sample size of crash data, recent studies have augmented crash data using machine learning techniques when developing safety evaluation models. However, the characteristics of crash data vary with spatial and temporal conditions, so these characteristics should be incorporated into both augmentation and crash risk assessment.

Method: This study developed a real-time crash risk model in three stages. First, crash data were clustered to define heterogeneous crash risk situations, and key variables were then derived using Boruta-SHAP, an ensemble and explainable artificial intelligence technique. Second, the crash data in each cluster were augmented using oversampling techniques, including the Conditional Generative Adversarial Network (CGAN), which can account for the characteristics of each crash risk cluster. Finally, crash risk models were developed with a binary logistic regression model (BLM), Random Forest (RF), extreme gradient boosting (XGBoost), and Support Vector Machine (SVM), and their performance was compared.

Results: The CGAN-based XGBoost model performed best, and the temporal speed difference at 10-minute intervals and precipitation had a large impact on crash risk prediction. This paper emphasizes that crash risk characteristics must be distinguished in crash risk prediction and provides new insights into addressing the data imbalance between crash and non-crash datasets.
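The idea behind the second stage, generating synthetic crash records conditioned on the crash risk cluster, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name the clustering algorithm, so plain k-means is assumed here, and a per-cluster Gaussian sampler stands in for the CGAN generator purely to show how the cluster label conditions the synthetic rows. The function names (`assign`, `kmeans`, `oversample_by_cluster`), the two toy features, and all numbers are hypothetical.

```python
import numpy as np

def assign(X, centroids):
    """Index of the nearest centroid for every row of X."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's k-means (assumed here; the abstract names no algorithm)."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign(X, centroids)
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return assign(X, centroids), centroids

def oversample_by_cluster(X_crash, cluster_ids, n_target, seed=0):
    """Pad every cluster up to n_target rows with synthetic samples.

    Stand-in for a conditional generator G(z | cluster): each synthetic row
    is drawn from a diagonal Gaussian fitted to the real rows of one cluster,
    so it inherits that cluster's characteristics. A CGAN would replace
    the rng.normal call below.
    """
    rng = np.random.default_rng(seed)
    rows_out, ids_out = [], []
    for c in np.unique(cluster_ids):
        real = X_crash[cluster_ids == c]
        n_new = max(n_target - len(real), 0)
        mean = real.mean(axis=0)
        std = real.std(axis=0) + 1e-6  # avoid a zero scale for tiny clusters
        synth = rng.normal(mean, std, size=(n_new, X_crash.shape[1]))
        rows_out.append(np.vstack([real, synth]))
        ids_out.append(np.full(len(real) + n_new, c))
    return np.vstack(rows_out), np.concatenate(ids_out)

# Toy crash records with two hypothetical features:
# [10-minute speed difference, precipitation]
rng = np.random.default_rng(42)
X_crash = np.vstack([
    rng.normal([5.0, 0.0], 1.0, size=(12, 2)),   # e.g. dry, stable flow
    rng.normal([20.0, 8.0], 1.0, size=(8, 2)),   # e.g. wet, unstable flow
])

cluster_ids, _ = kmeans(X_crash, k=2)
X_aug, c_aug = oversample_by_cluster(X_crash, cluster_ids, n_target=100)
print(X_aug.shape, np.bincount(c_aug))
```

The augmented crash rows would then be combined with non-crash rows to train the classifiers being compared. Keeping the cluster label alongside each synthetic row is the point of the conditioning step: it lets a model be trained or evaluated per crash risk situation rather than on one pooled, homogeneous crash class.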
Source journal: Journal of Safety Research
CiteScore: 6.40
Self-citation rate: 4.90%
Articles published: 174
Review time: 61 days
About the journal: Journal of Safety Research is an interdisciplinary publication that provides for the exchange of ideas and scientific evidence capturing studies through research in all areas of safety and health, including traffic, workplace, home, and community. This forum invites research using rigorous methodologies, encourages translational research, and engages the global scientific community through various partnerships (e.g., this outreach includes highlighting some of the latest findings from the U.S. Centers for Disease Control and Prevention).