Identification of adequate sample size for conflict-based crash risk evaluation: An investigation using Bayesian hierarchical extreme value theory models

IF 12.6 | Zone 1, Engineering & Technology | Q1, Public, Environmental & Occupational Health

Analytic Methods in Accident Research | Pub Date: 2023-09-01 | DOI: 10.1016/j.amar.2023.100281

Chuanyun Fu, Tarek Sayed

Analytic Methods in Accident Research, Volume 39, Article 100281. https://www.sciencedirect.com/science/article/pii/S2213665723000167

Citations: 0

Abstract

Traffic conflict-based models are widely used to estimate crash risk and evaluate the safety of road locations. A challenging issue in conflict-based crash risk modeling, however, is selecting an appropriate sample size. Reliable conflict-based crash risk models typically require a large sample, which is often difficult to collect. Further, the bias-variance trade-off of model estimation is a constant concern when choosing a sample size. This study proposes an approach for identifying an adequate sample size for conflict-based crash risk estimation models. The appropriate sample size is determined by checking model convergence and goodness-of-fit. A quantitative approach for objectively testing model goodness-of-fit is developed. Model convergence is inspected by examining both the trace plots and the variation tendencies of the Brooks-Gelman-Rubin statistics of the parameter simulation chains. A graphical method is also used to check goodness-of-fit. If the model has not converged or fits poorly, additional samples are required. The proposed method was applied to identify the adequate sample size for a Bayesian hierarchical extreme value theory (EVT) block maxima (BM) model using traffic conflict data from four signalized intersections in the city of Surrey, British Columbia. Modified time to collision (MTTC) was used as the indicator to delineate traffic conflicts. A series of stationary and non-stationary Bayesian hierarchical BM models were developed using the cycle-level maxima of negated MTTC. The adequate sample sizes of the stationary and non-stationary models were determined separately. Further, two methods of increasing the sample size (i.e., extending the observation period and combining data from different sites) were compared in terms of goodness-of-fit and the accuracy and precision of crash estimates.

The results show that, for both stationary and non-stationary models, the sample size used is adequate for model convergence and goodness-of-fit. Moreover, adding covariates to the stationary Bayesian hierarchical BM model does not affect the required sample size. For non-stationary models, extending the observation period outperforms combining data from different sites in terms of goodness-of-fit and the accuracy and precision of crash estimates. This is likely because merging data from several sites to increase the number of samples introduces unmeasured factors that can impair model estimation and inference. Overall, the findings can be used to examine whether the available data are adequate and how much additional data is required to produce reliable statistical inference.
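The convergence check mentioned in the abstract can be illustrated with a minimal sketch. It computes the classic Gelman-Rubin potential scale reduction factor for one parameter and recomputes it on growing prefixes of the chains to show its "variation tendency". This is an illustration only, not the authors' implementation: the function names `gelman_rubin` and `rhat_trend`, the synthetic chains, and the classic variance-ratio form of the statistic are all assumptions (some software reports an interval-width variant of the Brooks-Gelman-Rubin statistic instead).

```python
import numpy as np


def gelman_rubin(chains: np.ndarray) -> float:
    """Classic Gelman-Rubin statistic for one parameter.

    chains: array of shape (m, n), i.e. m chains with n post-burn-in draws each.
    Values close to 1 suggest the chains have mixed.
    """
    chains = np.asarray(chains, dtype=float)
    m, n = chains.shape
    chain_means = chains.mean(axis=1)
    chain_vars = chains.var(axis=1, ddof=1)
    within = chain_vars.mean()                      # W: mean within-chain variance
    between = n * chain_means.var(ddof=1)           # B: between-chain variance
    pooled = (n - 1) / n * within + between / n     # pooled variance estimate
    return float(np.sqrt(pooled / within))


def rhat_trend(chains: np.ndarray, checkpoints: list[int]) -> list[float]:
    """Recompute the statistic on the first k draws of every chain."""
    return [gelman_rubin(chains[:, :k]) for k in checkpoints]


# Synthetic chains (hypothetical data, not from the study).
rng = np.random.default_rng(0)
mixed = rng.normal(0.0, 1.0, size=(3, 2000))        # chains sampling the same target
stuck = mixed + np.array([[0.0], [2.0], [4.0]])     # chains centred at different values

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
trend = rhat_trend(mixed, [500, 1000, 2000])
```

In this sketch the well-mixed chains give a value near 1, while the shifted chains give a value well above 1. Under the approach described in the abstract, a statistic that does not settle near 1 would be one signal that more samples are needed; the exact cut-off used in the study is not stated here.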

Source journal

Analytic Methods in Accident Research Multiple-

CiteScore

22.10

Self-citation rate

34.10%

Articles published

Review time

24 days

About the journal: Analytic Methods in Accident Research is a journal that publishes articles related to the development and application of advanced statistical and econometric methods in studying vehicle crashes and other accidents. The journal aims to demonstrate how these innovative approaches can provide new insights into the factors influencing the occurrence and severity of accidents, thereby offering guidance for implementing appropriate preventive measures. While the journal primarily focuses on the analytic approach, it also accepts articles covering various aspects of transportation safety (such as road, pedestrian, air, rail, and water safety), construction safety, and other areas where human behavior, machine failures, or system failures lead to property damage or bodily harm.