A novel bi-level clustering optimization approach to balance treatment of crash data

IF 5.7 · Zone 1 (Engineering & Technology) · JCR Q1, ERGONOMICS

Accident; analysis and prevention Pub Date : 2025-05-17 DOI:10.1016/j.aap.2025.108107

Tanveer Ahmed, Vikash V. Gayah

Volume 219, Article 108107. Full text: https://www.sciencedirect.com/science/article/pii/S0001457525001939


Abstract

Understanding the impact of safety countermeasures on crash outcomes is crucial but challenging. When cross-sectional data are used to quantify a countermeasure’s effectiveness, underlying differences in road characteristics can lead to imbalances between treated sites and control sites that do not have the countermeasure, which can bias the evaluation. Propensity score-based matching methods have been widely used in the traffic safety literature to identify treated and control sites with more balanced covariates; however, using propensity scores does not guarantee that bias between treated and control entities is minimized, and its success depends heavily on how the propensity score model is formulated. To address this issue, this study introduces a novel Bi-Level Clustering Optimization (BLCO) method that matches treated and control sites in a way that minimizes imbalance across the two groups. The proposed method uses competitive learning to specifically minimize the sum of squares of the standardized bias of covariates across the treated and control groups, better simulating the conditions of a randomized trial with non-random observational data. The proposed BLCO method was compared with propensity score matching based on binary logit regression and on random forest algorithms, as well as with the genetic matching method. The results demonstrate that the proposed BLCO method significantly outperforms these benchmarks at balancing covariates across treated and control groups, reducing mean absolute standardized bias by 96.16% compared to the unmatched data and achieving an 88.76% improvement over propensity score matching. Additionally, treatment effects on the treated estimated from the optimally clustered data showed better model fit than those from the other methods. The proposed method is robust across varying dataset sizes and efficiently handles high-dimensional covariates without transformation, making it applicable to different domains for treatment effect estimation and informed decision-making.
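The balance metrics named in the abstract (standardized bias per covariate, its mean absolute value, and its sum of squares) can be illustrated with a short sketch. The abstract does not give the paper's exact formula, so the definition below is an assumption: the common form 100 × (difference in group means) / sqrt(average of the two group sample variances). The function names, covariate names, and toy values are all hypothetical and are not taken from the study.

```python
import math
from statistics import mean, variance


def standardized_bias(treated, control):
    """Standardized bias (in percent) of one covariate between two groups.

    Assumed definition (not confirmed by the abstract):
        100 * (mean_t - mean_c) / sqrt((var_t + var_c) / 2)
    using sample variances. Each group needs at least two values, and the
    pooled variance must be nonzero.
    """
    pooled_sd = math.sqrt((variance(treated) + variance(control)) / 2.0)
    return 100.0 * (mean(treated) - mean(control)) / pooled_sd


def balance_summary(treated_cols, control_cols):
    """Summarize imbalance over covariates given as {name: list of values}."""
    per_covariate = {
        name: standardized_bias(treated_cols[name], control_cols[name])
        for name in treated_cols
    }
    values = list(per_covariate.values())
    return {
        "per_covariate": per_covariate,
        # Mean absolute standardized bias: the headline metric in the abstract.
        "mean_abs_sb": mean([abs(v) for v in values]),
        # Sum of squared standardized bias: the quantity the abstract says
        # the proposed method minimizes.
        "sum_sq_sb": sum(v * v for v in values),
    }


def percent_reduction(before, after):
    """Percent reduction of a metric relative to its starting value."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    # Hypothetical toy covariates for a handful of sites (illustration only).
    treated = {"traffic_volume": [12.0, 15.0, 18.0], "lane_width": [3.3, 3.6, 3.6]}
    control = {"traffic_volume": [8.0, 10.0, 12.0], "lane_width": [3.3, 3.3, 3.6]}
    print(balance_summary(treated, control))
```

Under this assumed definition, a matching method would be judged by how far it lowers `mean_abs_sb` relative to the unmatched data, which is the kind of comparison `percent_reduction` expresses.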


Source journal: Accident; analysis and prevention

CiteScore: 11.90

Self-citation rate: 16.90%

Articles published: 264

Review time: 48 days

About the journal: Accident Analysis & Prevention provides wide coverage of the general areas relating to accidental injury and damage, including the pre-injury and immediate post-injury phases. Published papers deal with medical, legal, economic, educational, behavioral, theoretical or empirical aspects of transportation accidents, as well as with accidents at other sites. Selected topics within the scope of the Journal may include: studies of human, environmental and vehicular factors influencing the occurrence, type and severity of accidents and injury; the design, implementation and evaluation of countermeasures; biomechanics of impact and human tolerance limits to injury; modelling and statistical analysis of accident data; policy, planning and decision-making in safety.