Machine learning for occupational accident analysis: Applications, challenges, and future directions
Authors: Izuchukwu Chukwuma Obasi, Pericles Cheng, Cleo Varianou-Mikellidou, Christos Dimopoulos, Georgios Boustras
Journal: Journal of Safety Science and Resilience (安全科学与韧性, English edition), vol. 7, no. 1, Article 100250
Published: 2025-08-29
Publication type: Journal Article
DOI: 10.1016/j.jnlssr.2025.100250
URL: https://www.sciencedirect.com/science/article/pii/S2666449625000842
Impact factor: 3.4 (JCR Q1, Public, Environmental & Occupational Health)
Citations: 0
Abstract
Machine learning (ML) drives progress in occupational accident prevention across diverse sectors. However, significant challenges persist in aligning these tools with practical safety needs, including accurate risk assessment, incident prediction, and targeted prevention strategies. While prior reviews focused narrowly on specific industries or data types, this study presents a comprehensive analysis of ML models in accident analysis, categorizing them by accident type, industry application, and modeling methodology. This study addresses critical challenges in ML model development—such as data quality, hyperparameter tuning, and managing class imbalances—and examines less-discussed topics, including explanatory variable selection and strategies for mitigating overfitting. This review thoroughly assesses the current state of ML-based accident prediction, highlighting critical gaps, methodological limitations, and potential research directions. By analyzing 504 studies across three perspectives—Accident Type, Industry Application, and Modeling Methodology—this review identifies pressing challenges, including (1) limitations in data quality and availability, especially for real-time sources; (2) inadequate model interpretability across applications; (3) difficulties in handling imbalanced accident datasets; and (4) the lack of an integrated framework for incorporating proactive data and industry-specific risk factors. The findings outline a roadmap for advancing ML in occupational safety by enhancing model robustness, improving interpretability, and expanding data sources. This review aims to better align ML applications with safety objectives, promoting data-driven approaches for effective accident analysis and prevention across industries.