A novel bi-level clustering optimization approach to balance treatment of crash data

IF 5.7 | Zone 1, Engineering & Technology | JCR Q1, ERGONOMICS
Tanveer Ahmed, Vikash V. Gayah
Journal: Accident; analysis and prevention, vol. 219, Article 108107
Published: 2025-05-17 (Journal Article)
DOI: 10.1016/j.aap.2025.108107
URL: https://www.sciencedirect.com/science/article/pii/S0001457525001939
Citations: 0

Abstract

Understanding the impact of safety countermeasures on crash outcomes is crucial but challenging. When cross-sectional data are used to quantify a countermeasure's effectiveness, underlying differences in road characteristics can create imbalances between treated sites and control sites (those without the countermeasure), which can bias the evaluation. Propensity score-based matching methods have been widely used in the traffic safety literature to identify treated and control sites with more balanced covariates; however, propensity scores do not guarantee that bias between treated and control entities is minimized, and their success depends heavily on how the propensity score model is formulated. To address this issue, this study introduces a novel Bi-Level Clustering Optimization (BLCO) method that matches treated and control sites so as to minimize imbalance across the two groups. The method uses competitive learning to directly minimize the sum of squared standardized biases of the covariates across the treated and control groups, better simulating the conditions of a randomized trial with non-random observational data. BLCO was compared with propensity score matching based on binary logit regression and on random forest algorithms, as well as with the genetic matching method. The results demonstrate that BLCO significantly outperforms these benchmarks at balancing covariates across treated and control groups, reducing mean absolute standardized bias by 96.16% relative to the unmatched data and achieving an 88.76% improvement over propensity score matching. Additionally, treatment effects on the treated estimated from the optimally clustered data showed better model fit than those from the other methods. The proposed method is robust across varying dataset sizes and handles high-dimensional covariates efficiently without transformation, making it applicable to treatment effect estimation and informed decision-making in other domains.
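The two balance measures named in the abstract, the sum of squared standardized biases and the mean absolute standardized bias, can be illustrated with a minimal sketch. This is not the BLCO algorithm itself. It only evaluates balance for a given choice of control sites, and it assumes the commonly used definition of standardized bias (difference in group means divided by the square root of the average of the two sample variances, in percent), which the paper may define differently. The function names and the toy data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def standardized_bias(treated, control):
    """Standardized bias (in percent) of one covariate between two groups.

    Assumed definition: 100 * (mean_t - mean_c) / sqrt((var_t + var_c) / 2),
    using sample variances. The paper may use a different formula.
    """
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    if pooled_sd == 0:
        # Both groups are constant: balanced only if the constants agree.
        return 0.0 if mean(treated) == mean(control) else float("inf")
    return 100.0 * (mean(treated) - mean(control)) / pooled_sd


def balance_summary(treated_rows, control_rows):
    """Return (sum of squared biases, mean absolute bias) over all covariates.

    Each argument is a list of rows; each row is a list of covariate values.
    """
    biases = [
        standardized_bias(list(t_col), list(c_col))
        for t_col, c_col in zip(zip(*treated_rows), zip(*control_rows))
    ]
    sum_sq = sum(b * b for b in biases)
    mean_abs = sum(abs(b) for b in biases) / len(biases)
    return sum_sq, mean_abs


# Hypothetical toy data: columns are (lane width in m, traffic volume in 1000 veh/day).
treated = [[3.6, 12.0], [3.4, 15.0], [3.5, 14.0]]
control_all = [[3.0, 5.0], [3.5, 13.0], [3.6, 15.0], [2.9, 4.0], [3.3, 12.0]]
# Hand-picked subset standing in for the output of some matching procedure.
control_matched = [control_all[1], control_all[2], control_all[4]]

ss_before, masb_before = balance_summary(treated, control_all)
ss_after, masb_after = balance_summary(treated, control_matched)
reduction = 100.0 * (1 - masb_after / masb_before)
print(f"MASB before: {masb_before:.1f}%, after: {masb_after:.1f}%, "
      f"reduction: {reduction:.1f}%")
```

A matching method that targets balance directly would search over candidate control subsets for the one that minimizes `sum_sq`; the percentage reductions reported in the abstract correspond to comparisons of the `mean_abs` quantity before and after matching.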
Source journal
CiteScore: 11.90
Self-citation rate: 16.90%
Articles published: 264
Review time: 48 days
About the journal: Accident Analysis & Prevention provides wide coverage of the general areas relating to accidental injury and damage, including the pre-injury and immediate post-injury phases. Published papers deal with medical, legal, economic, educational, behavioral, theoretical or empirical aspects of transportation accidents, as well as with accidents at other sites. Selected topics within the scope of the Journal may include: studies of human, environmental and vehicular factors influencing the occurrence, type and severity of accidents and injury; the design, implementation and evaluation of countermeasures; biomechanics of impact and human tolerance limits to injury; modelling and statistical analysis of accident data; policy, planning and decision-making in safety.