Zeke Ahern, Paul Corry, Mohammadali Shirazi, Alexander Paz
A comprehensive multi-objective framework for the estimation of crash frequency models
Accident Analysis & Prevention, vol. 210, article 107844. Journal Article. Published 2025-02-01 (Epub 2024-12-02). Impact factor 5.7; JCR Q1 (Ergonomics). DOI: 10.1016/j.aap.2024.107844 (https://doi.org/10.1016/j.aap.2024.107844)
Abstract
Unobserved heterogeneity is a common and challenging aspect of crash data analysis and modeling; it is often handled using random parameters and special distributions such as the Lindley distribution. Random parameters can be estimated for each observation across the entire dataset, grouped across segments of the dataset, or specified with variable means or variable variances. The best approach to handling unobserved heterogeneity depends on the data characteristics and requires the corresponding hypothesis testing. In addition to unobserved heterogeneity, crash frequency modeling often requires explicit consideration of functional forms, transformations, and the identification of likely contributing factors. During model estimation, it is important to consider multiple objectives, such as in-sample and out-of-sample goodness-of-fit, to generate reliable and transferable insights. Taking all of these aspects and objectives into account simultaneously entails a very large number of modeling decisions and hypothesis tests. Limited testing and model development may introduce bias and miss relevant specifications that carry important insights. To address these challenges, this paper proposes a comprehensive optimization framework, underpinned by a mathematical programming formulation, for systematic hypothesis testing that simultaneously considers multiple objectives, unobserved heterogeneity, grouped random parameters, functional forms, transformations, heterogeneity in means, and the identification of likely contributing factors. The proposed framework employs a variety of metaheuristic solution algorithms to address the complexity and non-convexity of the estimation and optimization problem. Several metaheuristics were tested, including Simulated Annealing, Differential Evolution, and Harmony Search. Harmony Search converged with low sensitivity to the choice of hyperparameters.
The effectiveness of the framework was evaluated using three real-world datasets, generating results that are sound and consistent with the corresponding published models. These results demonstrate the ability of the proposed framework to efficiently estimate sound and parsimonious crash count models while reducing the costs associated with time and required knowledge, bias, and sub-optimal solutions due to limited testing. To support experimental testing by analysts and modelers, the Python package "MetaCountRegressor," which includes algorithms and software, is available on PyPI.
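To make the search idea concrete, the following is a minimal sketch of a Harmony Search over model specifications. It is not the MetaCountRegressor API and not the authors' formulation. It assumes a plain Poisson model with no random parameters, a binary include/exclude decision per covariate, and a single in-sample objective (BIC) in place of the paper's multiple objectives. All function names, the simulated data, and the hyperparameter values (`hmcr`, `par`, `memory_size`) are hypothetical choices for illustration.

```python
import numpy as np

def simulate_counts(n=500, p=6, seed=1):
    """Hypothetical data: only the first two of p covariates affect the counts."""
    rng = np.random.default_rng(seed)
    Z = rng.normal(size=(n, p))
    X = np.column_stack([np.ones(n), Z])          # column 0 is the intercept
    eta = 1.0 + 0.5 * Z[:, 0] - 0.4 * Z[:, 1]
    return X, rng.poisson(np.exp(eta))

def fit_poisson(X, y, iters=50):
    """Poisson regression by Newton-Raphson; assumes column 0 is the intercept."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean() + 1e-9)
    for _ in range(iters):
        mu = np.exp(np.clip(X @ beta, -20, 20))
        grad = X.T @ (y - mu)
        hess = X.T @ (X * mu[:, None]) + 1e-8 * np.eye(X.shape[1])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-8:
            break
    return beta

def bic(mask, X, y):
    """BIC of the Poisson model using the intercept plus the masked covariates."""
    cols = np.concatenate(([0], 1 + np.flatnonzero(mask)))
    Xs = X[:, cols]
    eta = np.clip(Xs @ fit_poisson(Xs, y), -20, 20)
    loglik = np.sum(y * eta - np.exp(eta))        # constant log(y!) term dropped
    return -2.0 * loglik + len(cols) * np.log(len(y))

def harmony_search(X, y, n_iter=200, memory_size=10, hmcr=0.9, par=0.3, seed=0):
    """Binary Harmony Search: each harmony is an include/exclude mask."""
    rng = np.random.default_rng(seed)
    p = X.shape[1] - 1
    memory = rng.integers(0, 2, size=(memory_size, p))
    scores = np.array([bic(m, X, y) for m in memory])
    for _ in range(n_iter):
        new = np.empty(p, dtype=int)
        for j in range(p):
            if rng.random() < hmcr:               # draw from harmony memory
                new[j] = memory[rng.integers(memory_size), j]
                if rng.random() < par:            # pitch adjustment = bit flip
                    new[j] = 1 - new[j]
            else:                                 # random improvisation
                new[j] = rng.integers(0, 2)
        score = bic(new, X, y)
        worst = np.argmax(scores)
        if score < scores[worst]:                 # replace the worst harmony
            memory[worst], scores[worst] = new, score
    best = np.argmin(scores)
    return memory[best], scores[best]

X, y = simulate_counts()
best_mask, best_bic = harmony_search(X, y)
print("selected covariates:", np.flatnonzero(best_mask), "BIC:", round(best_bic, 1))
```

A multi-objective variant along the lines described in the abstract would score each harmony on both an in-sample and an out-of-sample criterion and retain non-dominated solutions rather than a single best one; that extension is omitted here for brevity.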
About the journal:
Accident Analysis & Prevention provides wide coverage of the general areas relating to accidental injury and damage, including the pre-injury and immediate post-injury phases. Published papers deal with medical, legal, economic, educational, behavioral, theoretical or empirical aspects of transportation accidents, as well as with accidents at other sites. Selected topics within the scope of the Journal may include: studies of human, environmental and vehicular factors influencing the occurrence, type and severity of accidents and injury; the design, implementation and evaluation of countermeasures; biomechanics of impact and human tolerance limits to injury; modelling and statistical analysis of accident data; policy, planning and decision-making in safety.