Prediction of lumbar disc degeneration based on interpretable machine learning models: retrospective cohort study.
Authors: Tenghui Li, Weihui Qi, Xinning Mao, Gaoyong Jia, Wei Zhang, Xiaofeng Li, Hao Pan, Dong Wang
Journal: Spine Journal (impact factor 4.7; JCR Q1, Clinical Neurology)
Publication date: 2025-04-09
Publication type: Journal Article
DOI: https://doi.org/10.1016/j.spinee.2025.04.004
Citations: 0
Abstract
Background context: The paraspinal muscles play a critical role in maintaining lumbar spine stability, and different muscles may have varying impacts on lumbar disc degeneration (LDD). However, studies exploring these relationships remain relatively limited.
Purpose: This study aimed to investigate the relationship between various paravertebral muscles and LDD and to develop and validate a predictive model for LDD using machine learning (ML).
Study design: Retrospective cohort study.
Patient sample: A retrospective analysis was performed on hospitalized patients who underwent computed tomography (CT) and magnetic resonance imaging (MRI) examinations for chronic low back pain from February 2018 to January 2023.
Outcome measures: The primary outcome measures included model performance metrics such as receiver operating characteristic (ROC) curves, accuracy, sensitivity, specificity, F1 score, positive predictive value (PPV), negative predictive value (NPV), and calibration curves. Clinical decision-making benefits were assessed using decision curve analysis (DCA). Secondary outcome measures focused on model interpretability, evaluated through SHapley Additive exPlanations (SHAP), which identified key predictors and quantified their contributions to LDD prediction.
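The abstract does not state how the decision curve analysis was computed. As a minimal sketch, the commonly used net-benefit definition, TP/n − (FP/n) × pt/(1 − pt) at threshold probability pt, can be written as below. The function names and the toy labels and probabilities are illustrative assumptions, not study data.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating every patient whose predicted risk >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    # Count true and false positives among patients flagged at this threshold.
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true, threshold):
    """Reference strategy: treat everyone regardless of the model."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


# Toy example (hypothetical labels and predicted risks).
toy_labels = [1, 1, 1, 0, 0, 1, 0, 1]
toy_risks = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.1, 0.2]
for pt in (0.2, 0.5):
    model_nb = net_benefit(toy_labels, toy_risks, pt)
    all_nb = net_benefit_treat_all(toy_labels, pt)
    print(f"pt={pt}: model={model_nb:.4f}, treat-all={all_nb:.4f}")
```

A decision curve plots these values over a range of thresholds; a model is considered useful where its curve lies above both the treat-all and the treat-none (net benefit 0) strategies.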
Methods: This study enrolled 518 patients as the internal cohort, who were randomly assigned to a training set (70%) and a test set (30%). The Synthetic Minority Oversampling Technique (SMOTE) was applied to mitigate class imbalance in the training set. Model parameters were optimized using grid search and 10-fold cross-validation to develop four machine learning models: Extreme Gradient Boosting (XGBoost), Random Forest (RF), Logistic Regression (LR), and Decision Tree (DT). External validation was performed using data from 343 patients from different tertiary medical centers. Paraspinal muscle parameters on lumbar spine CT and MRI images were measured using ImageJ, and LDD was evaluated based on the Pfirrmann grading system. Spearman correlation analysis and logistic regression were performed to assess factors associated with LDD. Model performance was evaluated using metrics such as ROC curves, accuracy, sensitivity, F1 score, PPV, NPV, calibration curves, and DCA. The SHAP method was employed to interpret the ML models.
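The abstract names SMOTE but gives no implementation details. The core idea can be sketched in plain Python: each synthetic sample is an interpolation between a minority-class sample and one of its k nearest minority-class neighbours. The function name, the default k, and the toy feature rows are assumptions for illustration; the authors' software and settings are not reported here. In a setup like the one described, this would be applied to the training rows only, so the test set and the external cohort stay untouched.

```python
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic rows by interpolating between minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance to the base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(
            key=lambda j: sum((a - b) ** 2 for a, b in zip(base, minority[j]))
        )
        # Pick one of the k nearest neighbours at random.
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy two-feature minority-class rows (hypothetical values, not study data).
toy_minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.5], [1.2, 2.2]]
synthetic_rows = smote(toy_minority, n_synthetic=3, k=2)
print(len(synthetic_rows))
```

Because every synthetic row lies on a segment between two real minority rows, the oversampled class stays inside the range of the observed feature values.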
Results: This study included a total of 861 patients for analysis. In the external validation cohort, the XGBoost model demonstrated the best performance, achieving an AUC of 0.880 (95% CI: 0.826-0.935). Its accuracy (0.819), specificity (0.841), and positive predictive value (PPV=0.958) outperformed other models. Notably, it also exhibited superior sensitivity (0.814) and F1-score (0.880). SHAP analysis further revealed that age, the psoas muscle index (PMI), and the functional cross-sectional area (fCSA) of the multifidus muscle were critical predictors of LDD.
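For readers less familiar with these metrics, the sketch below shows how each is derived from a binary confusion matrix, using their standard definitions. The confusion-matrix counts are hypothetical, since the abstract does not report them; only the final check uses reported values (PPV 0.958 and sensitivity 0.814), and it reproduces the reported F1-score of 0.880.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # true positive rate (recall)
    specificity = tn / (tn + fp)  # true negative rate
    ppv = tp / (tp + fp)          # positive predictive value (precision)
    npv = tn / (tn + fn)          # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


def f1_from_ppv_and_sensitivity(ppv, sensitivity):
    """F1 is the harmonic mean of PPV (precision) and sensitivity (recall)."""
    return 2 * ppv * sensitivity / (ppv + sensitivity)


# Hypothetical counts for illustration only (not from the study).
toy_metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)

# Consistency check with the PPV and sensitivity reported for XGBoost.
reported_f1 = f1_from_ppv_and_sensitivity(0.958, 0.814)
print(round(reported_f1, 3))  # prints 0.88, i.e. the reported 0.880
```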
Conclusion: In this study, an LDD prediction model was developed using paravertebral muscle quantitative data and ML algorithms, with SHAP analysis incorporated to enhance model interpretability. The XGBoost model demonstrated the best predictive performance and holds potential to guide early clinical prevention and treatment.
About the journal:
The Spine Journal, the official journal of the North American Spine Society, is an international and multidisciplinary journal that publishes original, peer-reviewed articles on research and treatment related to the spine and spine care, including basic science and clinical investigations. It is a condition of publication that manuscripts submitted to The Spine Journal have not been published, and will not be simultaneously submitted or published elsewhere. The Spine Journal also publishes major reviews of specific topics by acknowledged authorities, technical notes, teaching editorials, and other special features. Letters to the Editor-in-Chief are encouraged.