Xinyuan Liu, Jinjun Tang, Chen Yuan, Fan Gao, Xizhi Ding
Examining the characteristics between time and distance gaps of secondary crashes
Transportation Safety and Environment. Published 2023-03-17. DOI: 10.1093/tse/tdad014 (https://doi.org/10.1093/tse/tdad014)
Abstract
Understanding the characteristics of the time and distance gaps between primary and secondary crashes is crucial for preventing secondary crashes and improving road safety. Although previous studies have analyzed how these gaps vary, there is limited evidence quantifying the relationships between the gaps and their influential factors. This study proposed a two-layer Stacking framework to model the time and distance gaps. Specifically, the framework took Random Forests, Gradient Boosting Decision Tree, and eXtreme Gradient Boosting as the base classifiers in the first layer and applied Logistic Regression as the combiner in the second layer. On this basis, the Local Interpretable Model-agnostic Explanations (LIME) technique was used to interpret the output of the Stacking model from both local and global perspectives. Through secondary crash identification and feature selection, 346 secondary crashes and 22 crash-related factors were collected from California interstate freeways. The results showed that the Stacking model outperformed the base models in accuracy, precision, and recall. The LIME-based explanations suggest that collision type, distance, speed, and volume are the critical features affecting the time and distance gaps. Higher volume can prolong queue length and increase the distance gap between the secondary and primary crashes. Collision type, peak periods, workdays, truck involvement, and tow-away are also likely to induce a long distance gap. Conversely, the distance gap is shorter when secondary roads run in the same direction as, and are close to, the primary roads. Lower speed is a significant factor associated with a long time gap, while higher speed is correlated with a short time gap. These results are expected to provide insight into how contributory features affect the time and distance gaps and to help decision-makers develop well-informed measures to prevent secondary crashes.