A comprehensive multi-objective framework for the estimation of crash frequency models

IF 5.7 · Zone 1 (Engineering & Technology) · JCR Q1 (Ergonomics)

Accident Analysis & Prevention · Pub Date: 2025-02-01 · DOI: 10.1016/j.aap.2024.107844

Zeke Ahern, Paul Corry, Mohammadali Shirazi, Alexander Paz

Volume 210, Article 107844. Full text: https://www.sciencedirect.com/science/article/pii/S0001457524003890

Citations: 0

Abstract


A common and challenging data and modeling aspect in crash analysis is unobserved heterogeneity, which is often handled using random parameters and special distributions such as Lindley. Random parameters can be estimated with respect to each observation for the entire dataset, and grouped across segments of the dataset, with variable means, or variable variances. The selection of the best approach to handle unobserved heterogeneity depends on the data characteristics and requires the corresponding hypothesis testing. In addition to dealing with unobserved heterogeneity, crash frequency modeling often requires explicit consideration of functional forms, transformations, and identification of likely contributing factors. During model estimation, it is important to consider multiple objectives such as in- and out-of-sample goodness-of-fit to generate reliable and transferable insights. Taking all of these aspects and objectives into account simultaneously represents a very large number of modeling decisions and hypothesis testing. Limited testing and model development may lead to bias and missing relevant specifications with important insights. To address these challenges, this paper proposes a comprehensive optimization framework, underpinned by a mathematical programming formulation, for systematic hypothesis testing considering simultaneously multiple objectives, unobserved heterogeneity, grouped random parameters, functional forms, transformations, heterogeneity in means, and the identification of likely contributing factors. The proposed framework employs a variety of metaheuristic solution algorithms to address the complexity and non-convexity of the estimation and optimization problem. Several metaheuristics were tested including Simulated Annealing, Differential Evolution and Harmony Search. Harmony Search provided convergence with low sensitivity to the choice of hyperparameters. 
The effectiveness of the framework was evaluated using three real-world data sets, generating sound and consistent results compared to the corresponding published models. These results demonstrate the ability of the proposed framework to efficiently estimate sound and parsimonious crash data count models while reducing costs associated with time and required knowledge, bias, and sub-optimal solutions due to limited testing. To support experimental testing for analysts and modelers, the Python package “MetaCountRegressor”, which includes algorithms and software, is available on PyPI.
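
To make the search idea in the abstract concrete, here is a minimal sketch of Harmony Search over binary model specifications, where each bit marks whether a candidate covariate enters the model. It is not the paper's formulation and not the MetaCountRegressor API: the function `harmony_search`, its parameters (`hmcr`, `par`, `memory_size`), the `TRUE_SPEC` vector, and `toy_objective` are all hypothetical names chosen for illustration. The toy objective stands in for a real score, such as an information criterion from a fitted count model, possibly combined with an out-of-sample error.

```python
import random


def harmony_search(objective, n_vars, memory_size=10, hmcr=0.9, par=0.3,
                   iterations=200, seed=0):
    """Minimise `objective` over binary vectors of length `n_vars`."""
    rng = random.Random(seed)
    # Harmony memory: a pool of candidate specifications and their scores.
    memory = [[rng.randint(0, 1) for _ in range(n_vars)]
              for _ in range(memory_size)]
    scores = [objective(h) for h in memory]
    history = [min(scores)]  # best score seen after each step

    for _ in range(iterations):
        new = []
        for j in range(n_vars):
            if rng.random() < hmcr:
                # Memory consideration: reuse a stored value for variable j.
                value = rng.choice(memory)[j]
                if rng.random() < par:
                    # Pitch adjustment: for a binary variable, flip it.
                    value = 1 - value
            else:
                # Random consideration: draw a fresh value.
                value = rng.randint(0, 1)
            new.append(value)

        # Replace the worst stored harmony if the new one scores better.
        new_score = objective(new)
        worst = max(range(memory_size), key=scores.__getitem__)
        if new_score < scores[worst]:
            memory[worst], scores[worst] = new, new_score
        history.append(min(scores))

    best = min(range(memory_size), key=scores.__getitem__)
    return memory[best], scores[best], history


# Hypothetical toy problem: pretend these covariates truly belong in the model.
TRUE_SPEC = [1, 0, 1, 1, 0, 0, 1, 0]


def toy_objective(spec):
    # Stand-in for a fit criterion: mismatches play the role of lack of fit,
    # and the second term is a small parsimony penalty per included covariate.
    mismatches = sum(a != b for a, b in zip(spec, TRUE_SPEC))
    return mismatches + 0.1 * sum(spec)


best_spec, best_score, history = harmony_search(toy_objective, len(TRUE_SPEC))
print(best_spec, best_score)
```

In a real application the objective would fit a count model for each candidate specification, which is far more expensive than this toy function. The replace-the-worst rule means the best score held in memory can never get worse from one iteration to the next.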


Source journal

Accident Analysis & Prevention Multiple-

CiteScore: 11.90

Self-citation rate: 16.90%

Articles published: 264

Review time: 48 days

Journal description: Accident Analysis & Prevention provides wide coverage of the general areas relating to accidental injury and damage, including the pre-injury and immediate post-injury phases. Published papers deal with medical, legal, economic, educational, behavioral, theoretical or empirical aspects of transportation accidents, as well as with accidents at other sites. Selected topics within the scope of the Journal may include: studies of human, environmental and vehicular factors influencing the occurrence, type and severity of accidents and injury; the design, implementation and evaluation of countermeasures; biomechanics of impact and human tolerance limits to injury; modelling and statistical analysis of accident data; policy, planning and decision-making in safety.