Title: A novel bi-level clustering optimization approach to balance treatment of crash data
Authors: Tanveer Ahmed, Vikash V. Gayah
Journal: Accident; analysis and prevention, volume 219, Article 108107
Publication date: 2025-05-17
Publication type: Journal Article
DOI: 10.1016/j.aap.2025.108107
URL: https://www.sciencedirect.com/science/article/pii/S0001457525001939
Impact factor: 5.7 (JCR Q1, Ergonomics)
Citation count: 0
Abstract
Understanding the impact of safety countermeasures on crash outcomes is crucial but challenging. When cross-sectional data are used to quantify a countermeasure’s effectiveness, underlying differences in road characteristics can lead to imbalances between treated sites and control sites that do not have the countermeasure, which can introduce bias into the evaluation. Propensity score-based matching methods have been widely used in the traffic safety literature to identify treated and control sites with more balanced covariates; however, the use of propensity scores does not guarantee that bias between treated and control entities is minimized, and its success depends heavily on how the propensity score model is formulated. To address this issue, this study introduces a novel Bi-Level Clustering Optimization (BLCO) method that matches treated and control sites in a way that minimizes imbalance across the two groups. The proposed method uses competitive learning to specifically minimize the sum of squares of the standardized bias of covariates across the treated and control groups, better simulating the conditions of a randomized trial using non-random observational data. The proposed BLCO method was compared with propensity score matching based on binary logit regression and on random forest algorithms, as well as with the genetic matching method. The results demonstrate that the proposed BLCO method significantly outperforms these benchmarks at balancing covariates across treated and control groups, reducing mean absolute standardized bias by 96.16% compared to the unmatched data and achieving an 88.76% improvement over propensity score matching. Additionally, treatment effects on the treated estimated using the optimally clustered data showed better model fit than those from the other methods. The proposed method is robust across varying dataset sizes and efficiently handles high-dimensional covariates without transformation, making it applicable to different domains for treatment effect estimation and informed decision-making.
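The balance measures named in the abstract can be illustrated with a short sketch. It is not the BLCO algorithm itself; it only computes the quantities the abstract refers to (sum of squared standardized bias and mean absolute standardized bias) for a given split into treated and control groups. The formula used here is the common pooled-standard-deviation definition of standardized bias, expressed in percent, which is an assumption, since the abstract does not state the exact definition. All function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def standardized_bias(treated, control):
    """Percent standardized bias of one covariate between two groups.

    Assumed definition: difference in group means divided by the square
    root of the average of the two sample variances, times 100.
    Raises ZeroDivisionError if both groups have zero variance.
    """
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2.0)
    return 100.0 * (mean(treated) - mean(control)) / pooled_sd


def balance_summary(treated_cols, control_cols):
    """Summarize balance over several covariates.

    Each argument is a list of columns, one list of values per covariate,
    in the same covariate order for both groups.
    """
    sb = [standardized_bias(t, c) for t, c in zip(treated_cols, control_cols)]
    return {
        "sum_sq_sb": sum(b * b for b in sb),              # objective-style measure
        "mean_abs_sb": sum(abs(b) for b in sb) / len(sb),  # reporting-style measure
    }


def percent_reduction(before, after):
    """Percent reduction of a balance measure relative to its starting value."""
    return 100.0 * (before - after) / before
```

Under these assumptions, a matching procedure would be judged by calling `balance_summary` on the unmatched and matched groups and passing the two `mean_abs_sb` values to `percent_reduction`, which is the kind of comparison behind the percentage figures reported in the abstract.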
About the journal:
Accident Analysis & Prevention provides wide coverage of the general areas relating to accidental injury and damage, including the pre-injury and immediate post-injury phases. Published papers deal with medical, legal, economic, educational, behavioral, theoretical or empirical aspects of transportation accidents, as well as with accidents at other sites. Selected topics within the scope of the Journal may include: studies of human, environmental and vehicular factors influencing the occurrence, type and severity of accidents and injury; the design, implementation and evaluation of countermeasures; biomechanics of impact and human tolerance limits to injury; modelling and statistical analysis of accident data; policy, planning and decision-making in safety.