Dan Wu, Lu Xing, Ye Li, Yiik Diew Wong, Jaeyoung Jay Lee, Changyin Dong
A framework for real-time traffic risk prediction incorporating cost-sensitive learning and dynamic thresholds
Accident Analysis & Prevention, Volume 218, Article 108087. Published 2025-05-05.
DOI: 10.1016/j.aap.2025.108087
URL: https://www.sciencedirect.com/science/article/pii/S0001457525001733
Impact factor: 5.7 (JCR Q1, Ergonomics)
Citations: 0
Abstract
In recent years, researchers have explored an approach that uses real vehicle trajectory data to derive traffic state and risk level simultaneously for real-time risk prediction, which is crucial for traffic safety. However, existing studies largely overlook the costs of incorrect predictions and the differing consequences of different misclassifications, which undermines the reliability of the prediction results. To address these gaps, this study refined traffic risk classification into four levels (no, low, medium, and high risk) and incorporated misclassification costs into the prediction process through cost-sensitive learning (CSL). Because multi-class prediction tasks often suffer performance degradation, and finer risk-level granularity worsens class imbalance and amplifies that degradation, the study also introduced dynamic thresholds (DTs) to improve model performance. The cost coefficients and thresholds were determined using a genetic algorithm (GA). The data, comprising traffic state variables and associated risk data, were sourced from the HighD dataset. CSL-DTs-based models were then built by integrating CSL and DTs with four baseline machine/deep learning models, and the prediction performance (e.g., precision) and computation time of these models were compared. Results show that the proposed models outperform their corresponding baselines on multi-class prediction tasks, and that their computation time is acceptable for real-time prediction. To check the reliability of the GA optimization (e.g., avoiding local optima), convergence curves were plotted, which confirmed the robustness of the optimization process. A robustness analysis further shows that the models remain highly stable under slight perturbations of the cost coefficients and thresholds, with minimal impact on performance. The findings are expected to enhance the reliability of real-time traffic risk prediction and to support proactive traffic safety management.
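The two ideas named in the abstract, cost-sensitive decisions and class-specific thresholds, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the cost matrix, the threshold values, the probability vector, and the ratio-based threshold rule are all hypothetical choices made for illustration, and the function names are invented here. In the study the cost coefficients and thresholds are tuned by a genetic algorithm rather than set by hand.

```python
from typing import List, Sequence

# Four risk levels as described in the abstract.
RISK_LEVELS = ["no", "low", "medium", "high"]

# Hypothetical cost matrix: COST[i][j] is the cost of predicting class j
# when the true class is i. Underestimating risk is penalised more heavily
# than overestimating it. These numbers are assumptions, not from the paper.
COST = [
    [0, 1, 2, 3],
    [2, 0, 1, 2],
    [4, 2, 0, 1],
    [8, 4, 2, 0],
]


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda k: values[k])


def expected_costs(probs: Sequence[float],
                   cost: Sequence[Sequence[float]]) -> List[float]:
    """Expected cost of predicting each class j: sum_i probs[i] * cost[i][j]."""
    n = len(probs)
    return [sum(probs[i] * cost[i][j] for i in range(n)) for j in range(n)]


def predict_cost_sensitive(probs: Sequence[float],
                           cost: Sequence[Sequence[float]]) -> int:
    """Pick the class with the lowest expected misclassification cost."""
    costs = expected_costs(probs, cost)
    return min(range(len(costs)), key=lambda j: costs[j])


def predict_with_thresholds(probs: Sequence[float],
                            thresholds: Sequence[float]) -> int:
    """Pick the class whose probability is largest relative to its threshold.

    A lower threshold makes a class easier to predict, which is one common
    way to counter class imbalance (an assumed rule, not the paper's).
    """
    return argmax([p / t for p, t in zip(probs, thresholds)])


# Hypothetical classifier output for one traffic sample.
probs = [0.50, 0.20, 0.10, 0.20]
# Hypothetical per-class thresholds; rarer, riskier classes get lower ones.
thresholds = [0.60, 0.30, 0.20, 0.15]

plain = argmax(probs)
cost_aware = predict_cost_sensitive(probs, COST)
thresholded = predict_with_thresholds(probs, thresholds)

print("plain argmax:   ", RISK_LEVELS[plain])        # no
print("cost-sensitive: ", RISK_LEVELS[cost_aware])   # low
print("with thresholds:", RISK_LEVELS[thresholded])  # high
```

With these placeholder values, the plain argmax predicts "no" risk, while the cost-sensitive rule shifts to "low" and the thresholded rule to "high", because both make it more expensive or harder to dismiss a sample that carries noticeable probability of high risk.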
About the journal:
Accident Analysis & Prevention provides wide coverage of the general areas relating to accidental injury and damage, including the pre-injury and immediate post-injury phases. Published papers deal with medical, legal, economic, educational, behavioral, theoretical or empirical aspects of transportation accidents, as well as with accidents at other sites. Selected topics within the scope of the Journal may include: studies of human, environmental and vehicular factors influencing the occurrence, type and severity of accidents and injury; the design, implementation and evaluation of countermeasures; biomechanics of impact and human tolerance limits to injury; modelling and statistical analysis of accident data; policy, planning and decision-making in safety.