ML-predicted surgical site infections: An epidemiological study utilizing machine learning on routinely collected healthcare data to predict infection risk
Davide Golinelli, Simona Rosa, Paola Rucci, Francesco Sanmarchi, Dario Tedesco, Carlo Biagetti, Alessio Gili, Andrea Bucci, Luca Romeo, Roberto Grilli
Smart Health, Volume 37, Article 100596. Published 2025-06-26. DOI: 10.1016/j.smhl.2025.100596
https://www.sciencedirect.com/science/article/pii/S2352648325000571
Abstract
Background
Surgical site infections (SSIs) are a major public health issue, causing increased morbidity, longer hospital stays, and higher healthcare costs. Despite progress in infection control, predicting and preventing SSIs remain crucial for improving patient outcomes. This study examines the use of machine learning (ML) on routinely collected healthcare data (RCD) to predict SSIs in orthopaedic surgery, aiming to improve risk stratification and guide interventions.
Objectives
To develop, test, and validate an ML predictive model using RCD to assess SSI risk in orthopaedic surgery patients.
Methods
A retrospective study was carried out using RCD from a population of 1.2 million served by an Italian Local Health Authority, covering surgeries performed from 2017 to 2021. The study population included patients undergoing hip or knee arthroplasty or open reduction of fractures. Several ML algorithms, including eXtreme Gradient Boosting (XGBoost), were used for model development. Model performance was assessed by recall, accuracy, and area under the receiver operating characteristic curve (AUC). A feature importance analysis identified key SSI risk predictors.
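The three performance metrics named above can be illustrated with a minimal sketch. It assumes binary outcome labels (1 = SSI, 0 = no SSI) and a risk score per patient from some fitted classifier. The toy data, the 0.5 decision threshold, and the function names are illustrative assumptions, not the study's data or pipeline, and no model (XGBoost or otherwise) is trained here.

```python
from typing import Sequence, Tuple


def recall_and_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Return (recall, accuracy) for binary labels where 1 marks an infection."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    # Recall: share of true infections that the model flagged.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # Accuracy: share of all patients classified correctly.
    accuracy = correct / len(y_true)
    return recall, accuracy


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 10 patients, 3 of whom developed an SSI.
y_true = [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
scores = [0.81, 0.10, 0.35, 0.62, 0.70, 0.05, 0.30, 0.20, 0.15, 0.40]

threshold = 0.5  # assumed cut-off; the abstract does not state one
y_pred = [1 if s >= threshold else 0 for s in scores]

recall, accuracy = recall_and_accuracy(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"recall={recall:.2f} accuracy={accuracy:.2f} auc={auc:.2f}")
```

Recall and accuracy depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds. For a rare outcome such as SSI, accuracy alone can look high even when few infections are detected, which is one reason to report recall alongside it.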
Results
The XGBoost model performed best, with a recall exceeding 70% and an AUC > 0.70, outperforming the other methods. Significant predictors included the American Society of Anesthesiologists (ASA) classification, opioid use, the priority class of the surgical procedure, and length of hospital stay.
Conclusions
ML models, particularly XGBoost, effectively predicted SSI risk in orthopaedic patients, offering a new approach to infection control and prevention. Combining ML with RCD highlights the potential for scalable, data-driven, personalized medicine interventions. Future research will focus on model validation and on integrating these tools into healthcare systems to enhance patient management.