Integrating machine learning and extreme value theory for estimating crash frequency-by-severity via AI-based video analytics

IF 14.5 Q1 TRANSPORTATION

Communications in Transportation Research, Pub Date: 2024-11-14, DOI: 10.1016/j.commtr.2024.100147

Fizza Hussain, Yuefeng Li, Md Mazharul Haque

Communications in Transportation Research, Volume 4, Article 100147. https://www.sciencedirect.com/science/article/pii/S2772424724000301

Citations: 0

Abstract

Traffic conflict techniques rely heavily on the proper identification of conflict extremes, which directly affects the prediction performance of extreme value models. Two sampling techniques, block maxima and peak over threshold, form the core of these models. Several studies have shown that extreme value models based on these sampling approaches yield crash estimates that are too imprecise, which hinders their widespread practical use. Anomaly detection techniques have recently been used to sample conflict extremes, but their application has been limited to estimating crash frequency, without considering crash severity. To address this research gap, this study proposes a hybrid model of machine learning and extreme value theory within a bivariate framework of traffic conflict measures to estimate crash frequency by severity level. In particular, modified time-to-collision (MTTC) and expected post-collision change in velocity (Delta-V or ΔV) are proposed in the hybrid modeling framework to estimate rear-end crash frequency by severity level. Rear-end conflicts were identified through artificial intelligence-based video analytics at three four-legged signalized intersections in Brisbane, Australia, using four days of data. Non-stationary bivariate hybrid generalized extreme value models were developed with two anomaly detection/sampling techniques: isolation forest and minimum covariance determinant. The non-stationarity of traffic conflict extremes was handled by parameterizing the location parameter, the scale parameter, or both simultaneously. The results indicate that the bivariate hybrid models can estimate severe and non-severe crashes, as judged against historical crash records, thereby demonstrating the viability of the proposed approach. A comparative analysis of the two anomaly detection techniques reveals that the isolation forest model marginally outperforms the minimum covariance determinant model. Overall, the modeling framework presented in this study advances conflict-based safety assessment by capturing the severity dimension via bivariate hybrid models.
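As a rough illustration of the two classical sampling techniques named in the abstract and of how a fitted generalized extreme value (GEV) model turns conflict extremes into a crash probability, the following is a minimal sketch, not the authors' method. It is univariate and stationary, whereas the study uses bivariate, non-stationary models with isolation forest and minimum covariance determinant sampling. The function names, the synthetic MTTC values, the block size, the threshold, and the GEV parameters are all hypothetical. Negating MTTC so that a crash corresponds to a value at or above zero is a convention common in conflict-based extreme value studies and is assumed here; the abstract does not state it.

```python
import math
import random


def block_maxima(values, block_size):
    """Return the maximum of each consecutive, non-overlapping full block."""
    return [max(values[i:i + block_size])
            for i in range(0, len(values) - block_size + 1, block_size)]


def peaks_over_threshold(values, threshold):
    """Return the exceedances (value - threshold) of values above the threshold."""
    return [v - threshold for v in values if v > threshold]


def gev_cdf(z, mu, sigma, xi):
    """GEV distribution function with location mu, scale sigma > 0, shape xi."""
    if sigma <= 0:
        raise ValueError("scale parameter sigma must be positive")
    s = (z - mu) / sigma
    if abs(xi) < 1e-12:
        # Gumbel limit as the shape parameter tends to zero
        return math.exp(-math.exp(-s))
    t = 1.0 + xi * s
    if t <= 0:
        # z lies outside the support: below it when xi > 0, above it when xi < 0
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))


def crash_risk(mu, sigma, xi):
    """Probability that a negated-MTTC extreme reaches zero or more (assumed crash condition)."""
    return 1.0 - gev_cdf(0.0, mu, sigma, xi)


# Synthetic, hypothetical MTTC observations in seconds (not data from the study).
random.seed(0)
mttc = [random.uniform(0.5, 6.0) for _ in range(1200)]
negated = [-t for t in mttc]  # smaller MTTC means a more extreme conflict

maxima = block_maxima(negated, block_size=100)             # one extreme per block
exceedances = peaks_over_threshold(negated, threshold=-1.0)  # conflicts with MTTC < 1 s

# Illustrative GEV parameters; in practice they would be estimated from the sampled extremes.
risk = crash_risk(mu=-2.0, sigma=0.5, xi=0.1)
print(len(maxima), len(exceedances), risk)
```

In a hybrid approach of the kind the abstract describes, the two sampling functions would be replaced by an anomaly detector that flags extreme conflicts, and the GEV parameters would be estimated from the flagged observations rather than fixed by hand.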

Source journal

Communications in Transportation Research

CiteScore

15.20

Self-citation rate

0.00%

Number of publications