Authors: Yifan Wang, Xuesong Wang
Journal: Accident; analysis and prevention, volume 221, article 108181
Publication type: Journal Article
Published: 2025-10-01 (Epub 2025-08-07)
DOI: 10.1016/j.aap.2025.108181 (https://doi.org/10.1016/j.aap.2025.108181)
Causal relationship discovery for highway crash analysis using semi-data-driven Bayesian network.
As advanced machine learning techniques become widespread, researchers need more transparent decision-making processes. Data-driven causal relationship discovery techniques often lack interpretability. To address this, a semi-data-driven Bayesian network structure learning algorithm, the Expert Knowledge Constraint-based (EKC) algorithm, is proposed. By integrating expert knowledge with conditional independence tests, the EKC algorithm constructs a causal Bayesian network with a high level of interpretability. The algorithm was applied to a highway safety scenario using crash data collected in 2022 from the HuNing Highway in China. The effects of the variables in the Bayesian network were estimated using Bayesian estimation, and the most dangerous scenarios were ranked using variable elimination. Key findings include: (1) date-related variables do not directly affect crashes; (2) unfavorable temperatures, medium-level traffic volumes, and snowy weather conditions are associated with higher crash probabilities; and (3) the highest crash probability occurs under medium traffic volume, cold temperatures, winter season, cloudy weather, morning hours, and weekdays. The EKC algorithm was compared with the Hill Climbing algorithm, the Chow-Liu Trees algorithm, and a logistic model, and showed significant improvements in interpretability while maintaining good fitting scores. Finally, a framework for defining model interpretability in traffic crash analytics is discussed, covering causality, trust, heterogeneity, transferability, and stability.
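The idea of combining expert knowledge with conditional independence tests can be illustrated with a minimal sketch. This is not the authors' EKC algorithm, whose details the abstract does not give. It is a generic constraint-based procedure under stated assumptions: expert knowledge is encoded as hypothetical causal tiers (an edge may only point from a lower tier to a higher one), and empirical conditional mutual information with a fixed threshold stands in for a proper statistical independence test. The variable names, the threshold, and the toy counts are all invented for illustration and are not taken from the HuNing Highway data.

```python
from collections import Counter
from itertools import combinations
from math import log


def cond_mutual_info(rows, x, y, z=()):
    """Empirical conditional mutual information I(X; Y | Z) in nats."""
    n = len(rows)

    def zkey(r):
        return tuple(r[k] for k in z)

    c_xyz = Counter((r[x], r[y], zkey(r)) for r in rows)
    c_xz = Counter((r[x], zkey(r)) for r in rows)
    c_yz = Counter((r[y], zkey(r)) for r in rows)
    c_z = Counter(zkey(r) for r in rows)
    return sum(
        (c / n) * log(c * c_z[zv] / (c_xz[xv, zv] * c_yz[yv, zv]))
        for (xv, yv, zv), c in c_xyz.items()
    )


def learn_structure(rows, tiers, threshold=0.01, max_cond=1):
    """Keep an edge unless some small conditioning set separates the pair,
    then orient it from the lower expert tier to the higher one."""
    variables = list(tiers)
    edges = set()
    for x, y in combinations(variables, 2):
        if tiers[x] == tiers[y]:
            continue  # this sketch has no rule for orienting same-tier pairs
        others = [v for v in variables if v not in (x, y)]
        separated = any(
            cond_mutual_info(rows, x, y, z) < threshold
            for k in range(max_cond + 1)
            for z in combinations(others, k)
        )
        if not separated:
            cause, effect = (x, y) if tiers[x] < tiers[y] else (y, x)
            edges.add((cause, effect))
    return edges


# Synthetic toy counts (NOT real crash data): (weekday, volume, crash) -> count.
# They are built so that crash depends on volume only.
counts = {
    ("yes", "medium", 1): 12, ("yes", "medium", 0): 12,
    ("yes", "low", 1): 2, ("yes", "low", 0): 6,
    ("no", "medium", 1): 4, ("no", "medium", 0): 4,
    ("no", "low", 1): 6, ("no", "low", 0): 18,
}
rows = []
for (weekday, volume, crash), count in counts.items():
    rows += [{"weekday": weekday, "volume": volume, "crash": crash}] * count

# Hypothetical expert ordering: calendar -> traffic -> outcome.
tiers = {"weekday": 0, "volume": 1, "crash": 2}
edges = learn_structure(rows, tiers)
print(sorted(edges))
```

On this toy table the sketch keeps weekday → volume and volume → crash and drops the direct weekday–crash link, because weekday and crash are independent once volume is known. A real analysis would replace the fixed threshold with a significance test (for example a chi-square or G-test) and would need far more data per conditioning cell.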
About the journal:
Accident Analysis & Prevention provides wide coverage of the general areas relating to accidental injury and damage, including the pre-injury and immediate post-injury phases. Published papers deal with medical, legal, economic, educational, behavioral, theoretical or empirical aspects of transportation accidents, as well as with accidents at other sites. Selected topics within the scope of the Journal may include: studies of human, environmental and vehicular factors influencing the occurrence, type and severity of accidents and injury; the design, implementation and evaluation of countermeasures; biomechanics of impact and human tolerance limits to injury; modelling and statistical analysis of accident data; policy, planning and decision-making in safety.