Rodney Allanigue Gabriel, Bhavya Harjai, Sierra Simpson, Austin Liu Du, Jeffrey Logan Tully, Olivier George, Ruth Waterman
JMIR Perioperative Medicine, vol. 6, e39650. Published 2023-01-26. DOI: 10.2196/39650 (https://doi.org/10.2196/39650). PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9912154/pdf/
An Ensemble Learning Approach to Improving Prediction of Case Duration for Spine Surgery: Algorithm Development and Validation.
Background: Estimating surgical case duration accurately is an important operating room efficiency metric. Current predictive techniques in spine surgery include less sophisticated approaches such as classical multivariable statistical models. Machine learning approaches have been used to predict outcomes such as length of stay and time to return to normal work, but have not focused on case duration.
Objective: The primary objective of this 4-year, single-academic-center, retrospective study was to use an ensemble learning approach that may improve the accuracy of scheduled case duration for spine surgery. The primary outcome measure was case duration.
Methods: We compared machine learning models using surgical and patient features to our institutional method, which used historic averages and surgeon adjustments as needed. We implemented multivariable linear regression, random forest, bagging, and XGBoost (Extreme Gradient Boosting) and calculated the average R², root-mean-square error (RMSE), explained variance, and mean absolute error (MAE) using k-fold cross-validation. We then used the SHAP (Shapley Additive Explanations) explainer model to determine feature importance.
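As a rough illustration of the evaluation described in the Methods, the sketch below computes the four named metrics on held-out folds and averages them across k folds. It is not the study's code: the data are synthetic, the one-feature least-squares line is a stand-in for the models the authors compared, and the names `regression_metrics`, `fit_line`, and `k_fold_scores` as well as k=5 are assumptions made for this example.

```python
import math
import random


def regression_metrics(y_true, y_pred):
    """Return R^2, RMSE, explained variance, and MAE for one fold."""
    n = len(y_true)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    mean_true = sum(y_true) / n
    mean_resid = sum(residuals) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(r ** 2 for r in residuals)
    # Explained variance ignores a constant bias in the residuals; R^2 does not.
    var_resid = sum((r - mean_resid) ** 2 for r in residuals)
    return {
        "r2": 1 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "explained_variance": 1 - var_resid / ss_tot,
        "mae": sum(abs(r) for r in residuals) / n,
    }


def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def k_fold_scores(xs, ys, k=5, seed=0):
    """Average each metric over k held-out folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    totals = {}
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        slope, intercept = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [slope * xs[i] + intercept for i in held_out]
        fold = regression_metrics([ys[i] for i in held_out], preds)
        for name, value in fold.items():
            totals[name] = totals.get(name, 0.0) + value / k
    return totals


# Synthetic data: minutes = 60 per spine level + 90 + noise (made-up numbers).
rng = random.Random(42)
levels = [rng.randint(1, 8) for _ in range(200)]
minutes = [60 * lv + 90 + rng.gauss(0, 30) for lv in levels]
scores = k_fold_scores(levels, minutes)
print({name: round(value, 3) for name, value in scores.items()})
```

Two relationships hold by construction and are useful when reading such results: explained variance is never lower than R² (they differ only when the residuals have a nonzero mean), and RMSE is never lower than MAE (the gap widens when a few cases have very large errors).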
Results: A total of 3189 patients who underwent spine surgery were included. The institution's current method of predicting case times had a very poor coefficient of determination with actual times (R²=0.213). On k-fold cross-validation, the linear regression model had an explained variance score of 0.345, an R² of 0.34, an RMSE of 162.84 minutes, and an MAE of 127.22 minutes. Among all models, the XGBoost regressor performed best, with an explained variance score of 0.778, an R² of 0.770, an RMSE of 92.95 minutes, and an MAE of 44.31 minutes. Based on SHAP analysis of the XGBoost regression, body mass index, spinal fusions, surgical procedure, and number of spine levels involved were the features with the most impact on the model.
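The idea behind SHAP-style feature attribution can be sketched with exact Shapley values on a tiny example. The code below is an illustration only, not the SHAP library and not the study's model: `toy_duration_model`, its coefficients, the three features, and the single baseline case are all hypothetical, and real SHAP explainers average over background data rather than one baseline. Each feature's attribution is its weighted average marginal contribution over all subsets of the other features.

```python
from itertools import combinations
from math import factorial


def toy_duration_model(features):
    """Hypothetical model (not the paper's): predicted minutes for one case."""
    minutes = 90 + 45 * features["levels"] + 2 * (features["bmi"] - 25)
    if features["fusion"]:
        # Interaction: fusion adds more time when more levels are involved.
        minutes += 30 + 15 * features["levels"]
    return minutes


def shapley_values(model, case, baseline):
    """Exact Shapley attribution of model(case) - model(baseline) per feature."""
    names = list(case)
    n = len(names)

    def value(subset):
        # Features in `subset` take the case's values; the rest stay at baseline.
        mixed = {k: (case[k] if k in subset else baseline[k]) for k in names}
        return model(mixed)

    result = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                with_feature = value(set(subset) | {name})
                without_feature = value(set(subset))
                total += weight * (with_feature - without_feature)
        result[name] = total
    return result


baseline = {"levels": 1, "bmi": 25, "fusion": False}
case = {"levels": 4, "bmi": 35, "fusion": True}
phi = shapley_values(toy_duration_model, case, baseline)
print(phi)
```

In this toy case the attributions sum to the difference between the two predictions (380 minutes for the case, 135 for the baseline), and the interaction between fusion and number of levels is split evenly between those two features. Ranking features by the mean absolute attribution across many cases is one common way such per-case values are turned into an overall importance ordering.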
Conclusions: Using ensemble learning-based predictive models, specifically XGBoost regression, can improve the accuracy of the estimation of spine surgery times.