Carter M. Powell BA, William N. Newton MD, Robert J. Reis BS, John W. Moore BS, Brandon L. Rogalski MD, Josef K. Eichinger MD, Richard J. Friedman MD, FRCSC
Seminars in Arthroplasty, Volume 35, Issue 2, Pages 203-209. Published 2025-01-13. DOI: 10.1053/j.sart.2024.12.006. https://www.sciencedirect.com/science/article/pii/S1045452725000069
Using machine learning to predict postoperative complications of total shoulder arthroplasty
Background
Previous efforts to use machine learning to predict complications following primary total shoulder arthroplasty (TSA) have shown promise, but the clinical significance of such predictive models has been limited by inadequate sample sizes and short (approximately 30-day) follow-up periods. The Nationwide Readmissions Database, with a large sample size and longer follow-up period, has the potential to reduce the noise of previous modeling efforts. The purpose of this study is to evaluate the accuracy and effectiveness of 4 different models for predicting 180-day complications, extended length of stay (LOS), and mechanical failures in patients undergoing primary TSA.
Methods
The Nationwide Readmissions Database was queried for patients who underwent TSA from 2016 to 2020. Primary outcomes were complications within 180 days, extended LOS (defined as >2 days), and mechanical failure. For each outcome, 4 models were created using Python v3.9: a weighted logistic regression, a random forest classifier, a gradient boosting classifier, and an artificial neural network. Model performance was assessed using accuracy, area under the receiver operating characteristic curve (area under the curve [AUC]), sensitivity, positive predictive value (PPV), and F1 score.
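For readers less familiar with the performance measures listed above, the following is a minimal sketch of how they can be computed for a binary outcome. It is not the authors' code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = event)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def classification_metrics(y_true, y_pred):
    """Accuracy, sensitivity (recall), PPV (precision), and F1 score."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    sensitivity = safe_div(tp, tp + fn)  # share of true events that were flagged
    ppv = safe_div(tp, tp + fp)          # share of flagged cases that were true events
    return {
        "accuracy": safe_div(tp + tn, tp + fp + tn + fn),
        "sensitivity": sensitivity,
        "ppv": ppv,
        "f1": safe_div(2 * ppv * sensitivity, ppv + sensitivity),
    }


def roc_auc(y_true, scores):
    """AUC as the probability that a random event outscores a random non-event.

    Ties count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return safe_div(wins, len(pos) * len(neg))


# Hypothetical toy data: 3 events among 10 patients, with made-up model scores.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
scores = [0.9, 0.6, 0.3, 0.7, 0.4, 0.2, 0.2, 0.1, 0.1, 0.05]
threshold = 0.5  # assumed cut-off; the abstract does not state one
y_pred = [1 if s >= threshold else 0 for s in scores]

metrics = classification_metrics(y_true, y_pred)
metrics["auc"] = roc_auc(y_true, scores)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Accuracy, sensitivity, PPV, and F1 all depend on the chosen threshold, whereas AUC is computed from the raw scores and summarizes ranking quality across all thresholds.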
Results
A total of 178,003 patients underwent primary TSA from 2016 to 2020. For predicting 180-day complications, gradient-boosted classification had the highest discriminative ability and sensitivity (accuracy: 0.69, AUC: 0.71, sensitivity: 0.59, PPV: 0.21, and F1: 0.31). For predicting extended LOS, an artificial neural network proved most effective (accuracy: 0.79, AUC: 0.82, sensitivity: 0.67, PPV: 0.43, and F1: 0.52; Table II). All models performed equally poorly at predicting mechanical complications.
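As a quick arithmetic check on the figures above, the F1 score is the harmonic mean of PPV and sensitivity, so it can be recomputed from the two reported values. The sketch below does that and also converts each PPV into the number of false alarms per correctly flagged patient. The helper names are hypothetical; only the sensitivity, PPV, and F1 values come from the abstract.

```python
def f1_from(ppv, sensitivity):
    """F1 score as the harmonic mean of PPV and sensitivity."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


def false_alarms_per_true_positive(ppv):
    """Number of false positives for every true positive at a given PPV."""
    return (1 - ppv) / ppv


# Values reported in the Results; the dictionary keys are labels chosen here.
reported = {
    "180-day complications (gradient boosting)": {"sensitivity": 0.59, "ppv": 0.21, "f1": 0.31},
    "extended LOS (neural network)": {"sensitivity": 0.67, "ppv": 0.43, "f1": 0.52},
}

for outcome, r in reported.items():
    recomputed = f1_from(r["ppv"], r["sensitivity"])
    ratio = false_alarms_per_true_positive(r["ppv"])
    print(f"{outcome}: reported F1 {r['f1']:.2f}, recomputed F1 {recomputed:.2f}, "
          f"about {ratio:.1f} false alarms per true positive")
```

Both recomputed F1 values round to the reported ones. A PPV of 0.21 corresponds to roughly four patients flagged incorrectly for every one flagged correctly, which helps explain why the models are described as not yet clinically significant.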
Conclusion
Machine learning has the potential to accurately predict rare outcomes from heterogeneous data; however, the quality of predictive models depends on the quality of the input data. Although machine-learning models are superior to simpler methods at predicting certain outcomes, such as extended LOS, they currently lack the sensitivity and PPV to be clinically significant.
About the journal:
Each issue of Seminars in Arthroplasty provides a comprehensive, current overview of a single topic in arthroplasty. The journal addresses orthopedic surgeons, providing authoritative reviews with emphasis on new developments relevant to their practice.