Evaluating Predictive Performance and Generalizability of Traditional and Artificial Intelligence Models in Predicting Surgical Site Infections Post-Spinal Surgery: A Systematic Review.
Laura C M Ndjonko, Aritra Chakraborty, Francesco Petri, Seyed Mohammad Amin Alavi, Takahiro Matsuo, Fabio Borgonovo, Isin Y Comba, Mohammad H Murad, Ahmad Nassr, Said El-Zein, Elie F Berbari
Spine Journal, published 2025-07-14. DOI: 10.1016/j.spinee.2025.07.032 (https://doi.org/10.1016/j.spinee.2025.07.032)
Citations: 0
Abstract
Background context: Surgical site infections (SSIs) are a significant complication following spinal surgery. These infections contribute to increased morbidity, prolonged hospital stays, and substantial healthcare costs. Traditional statistical models have been widely used to predict SSI risk, but artificial intelligence (AI) and its machine learning (ML) methods have also been used for SSI prediction.
Purpose: This systematic review aims to evaluate the predictive accuracy of AI models versus traditional statistical models in assessing SSI risk following spinal surgery.
Study design/setting: A systematic review was conducted following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines.
Methods: We searched Medline, Embase, Scopus, Web of Science, and ClinicalTrials.gov. Studies were included if they developed predictive models for SSI following spinal surgery using either AI or traditional statistical approaches. Risk of bias for all studies was assessed using the Prediction model Risk of Bias Assessment Tool (PROBAST). Predictive model performance was compared using metrics such as the C-statistic and the area under the receiver operating characteristic curve (AUC-ROC).
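For readers unfamiliar with the C-statistic named in the Methods, the following is a minimal sketch of how it can be computed for a binary outcome such as SSI, where it coincides with the AUC-ROC. The function name `c_statistic` and the toy data are hypothetical illustrations and are not taken from any study in the review.

```python
from itertools import product


def c_statistic(outcomes, risks):
    """Concordance for a binary outcome.

    Returns the fraction of (event, non-event) pairs in which the event
    received the higher predicted risk; tied pairs count as 0.5.
    """
    events = [r for y, r in zip(outcomes, risks) if y == 1]
    non_events = [r for y, r in zip(outcomes, risks) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")

    score = 0.0
    for event_risk, non_event_risk in product(events, non_events):
        if event_risk > non_event_risk:
            score += 1.0
        elif event_risk == non_event_risk:
            score += 0.5
    return score / (len(events) * len(non_events))


# Hypothetical toy data (assumption, for illustration only):
# 1 = SSI occurred, 0 = no SSI; p holds predicted risks.
y = [1, 0, 1, 0, 0]
p = [0.9, 0.2, 0.4, 0.5, 0.1]
print(c_statistic(y, p))
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect discrimination. The statistic measures ranking only and says nothing about calibration, which is one reason the review tracks calibration reporting separately.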
Results: A total of 51 studies were included. Among these, 42 studies used traditional statistical methods, while 9 used AI/ML models. Logistic regression was the most common method among traditional models (95.2%). Across the ML studies, all of which used supervised models trained on tabular data, decision-tree-based and linear algorithms (n = 7, 77.8% each) were the most common, followed by neural networks and support vector machines (n = 4, 44.4% each). Traditional models achieved a C-statistic between 0.7 and 0.8 in 40.5% of cases (n = 17), with only 4.8% (n = 2) exceeding 0.9. AI models showed a C-statistic of 0.9 or higher in 44.4% of cases (n = 4). However, 77.8% of the ML studies (n = 7) performed internal cross-validation, only 33.3% (n = 3) reported calibration data, and none were externally validated, which raises important concerns about their current clinical applicability and generalizability.
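The percentages in the Results can be checked against the stated counts and denominators (42 traditional studies, 9 AI/ML studies). The sketch below is a simple consistency check. The helper `pct` and the labels are illustrative assumptions, and only figures whose counts are given explicitly in the abstract are included.

```python
def pct(n, total):
    """Percentage of n out of total, rounded to one decimal place."""
    return round(100 * n / total, 1)


TRADITIONAL = 42  # studies using traditional statistical methods
ML = 9            # studies using AI/ML models

# (label, count, denominator, percentage reported in the abstract)
reported = [
    ("traditional, C-statistic 0.7 to 0.8", 17, TRADITIONAL, 40.5),
    ("traditional, C-statistic above 0.9", 2, TRADITIONAL, 4.8),
    ("ML, tree-based / linear algorithms", 7, ML, 77.8),
    ("ML, neural networks / SVMs", 4, ML, 44.4),
    ("ML, C-statistic 0.9 or higher", 4, ML, 44.4),
    ("ML, internal cross-validation", 7, ML, 77.8),
    ("ML, calibration reported", 3, ML, 33.3),
]

for label, n, total, stated in reported:
    status = "ok" if pct(n, total) == stated else "mismatch"
    print(f"{label}: {n}/{total} = {pct(n, total)}% ({status})")
```

With only 9 ML studies, each study shifts a percentage by about 11 points, so the ML proportions should be read with that small denominator in mind.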
Conclusions: This systematic review, the first of its kind, observed that studies using ML models reported a potential for excellent classification accuracy in predicting SSI following spinal surgery. However, current methodological shortcomings limit their generalizability and immediate clinical implementation. Most ML studies remain in the early stages of development, and their findings of excellent performance should be interpreted with caution. This review highlights the need for standardized model benchmarking and for external validation to reliably assess generalizability. Furthermore, advancing beyond conventional tabular data by incorporating state-of-the-art AI models that leverage multi-modal data could significantly expand the potential of predictive analytics in this domain and thereby help guide clinical decision making.
About the journal:
The Spine Journal, the official journal of the North American Spine Society, is an international and multidisciplinary journal that publishes original, peer-reviewed articles on research and treatment related to the spine and spine care, including basic science and clinical investigations. It is a condition of publication that manuscripts submitted to The Spine Journal have not been published, and will not be simultaneously submitted or published elsewhere. The Spine Journal also publishes major reviews of specific topics by acknowledged authorities, technical notes, teaching editorials, and other special features. Letters to the Editor-in-Chief are encouraged.