Developing safety performance functions incorporating pavement roughness using Poisson regression and Machine learning models on Jordan’s Desert Highway
Authors: Hazem Al-Mahamid, Diana Al-Nabulsi, Adam Torok
Journal: Transportation Research Interdisciplinary Perspectives, vol. 34, Article 101659 (JCR Q2, Transportation; impact factor 3.8)
Published: 2025-10-01
DOI: 10.1016/j.trip.2025.101659
URL: https://www.sciencedirect.com/science/article/pii/S2590198225003380
Citation count: 0
Abstract
This study presents a high-resolution, hybrid model for forecasting crash frequency along a critical segment of Jordan’s Desert Highway, combining classical statistical inference with machine learning algorithms. Using a multi-stage approach comprising data preprocessing, feature engineering, and multicollinearity diagnostics, the analysis applies Poisson regression, Random Forest, XGBoost, and Support Vector Regression (SVR) to model the relationships between crash occurrences and key covariates: traffic volume (annual average daily traffic, AADT), pavement roughness (International Roughness Index, IRI), vehicle speed, and driver age. Model performance was evaluated using k-fold cross-validation and four metrics (R², RMSE, MAE, MAPE); SVR yielded the most accurate predictions (R² = 0.983), substantially surpassing the Poisson baseline. Residual analyses confirmed reduced bias and variance in the machine learning estimators. Feature importance assessments using SHAP further underscored the dominant influence of AADT and IRI on crash likelihood. The findings indicate that non-parametric machine learning models are empirically superior at capturing non-linear, context-sensitive crash dynamics, and they support deploying such models in contemporary traffic safety analysis. The study also emphasises the strategic value of granular, high-fidelity data and recommends incorporating spatiotemporal modelling and explainable artificial intelligence (XAI) to improve interpretability, generalisability, and real-time applicability in infrastructure risk management.
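The abstract names four evaluation metrics but does not give their formulas. The sketch below uses the standard textbook definitions and may differ in detail from what the authors computed. The function name `regression_metrics`, the choice to skip zero-crash observations when computing MAPE (which is undefined there), and the sample numbers are all assumptions made for illustration; none of the values come from the study.

```python
import math


def regression_metrics(observed, predicted):
    """Return R^2, RMSE, MAE and MAPE (%) for paired observed/predicted counts."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(observed)
    mean_obs = sum(observed) / n
    errors = [o - p for o, p in zip(observed, predicted)]
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    # Assumption: MAPE is undefined where the observed count is zero,
    # so those observations are skipped rather than imputed.
    pct = [abs(e) / abs(o) for e, o in zip(errors, observed) if o != 0]
    return {
        "r2": 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan"),
        "rmse": math.sqrt(ss_res / n),
        "mae": sum(abs(e) for e in errors) / n,
        "mape": 100.0 * sum(pct) / len(pct) if pct else float("nan"),
    }


# Made-up crash counts per road segment and made-up model predictions.
observed = [2, 0, 5, 3, 1]
predicted = [2.5, 0.5, 4.0, 3.0, 1.5]
metrics = regression_metrics(observed, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Crash data often contain many segments with zero crashes, so how zeros are handled can change MAPE noticeably; the skip rule used here is only one of several possible conventions.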