Leveraging data-driven machine learning: From explainable risk prediction to hierarchical clustering-based subtypes of postoperative delirium in a prospective non-cardiac surgery cohort
Jin-yun Shi , Da-peng Gao , Rong Chen , Xiao-yi Hu , Lan-yue Zhu , Yue Zhang , Qing Li , Qing-hong Mao , Mu-huo Ji , Di Fan , Qing-ren Liu
Journal of Clinical Anesthesia, volume 107, Article 112006. Published 2025-09-18. DOI: 10.1016/j.jclinane.2025.112006. https://www.sciencedirect.com/science/article/pii/S0952818025002673
Citations: 0
Abstract
Study objective
To leverage perioperative indicators in developing an explainable machine learning (ML) model for postoperative delirium (POD) prediction, discover distinct data-driven POD subtypes through hierarchical clustering analysis, and enhance personalized risk stratification to inform targeted clinical interventions.
Methods
This is a secondary analysis of several prospective observational studies, including 1106 patients who underwent non-cardiac surgery. Univariate analysis and least absolute shrinkage and selection operator (LASSO) regression were used to screen essential features associated with POD. We compared six algorithms: adaptive boosting with classification trees, random forest (RF), neural networks, support vector machines, extreme gradient boosting with classification trees, and logistic regression. SHapley Additive exPlanations (SHAP) was used to interpret the best-performing model, which was externally validated in another large tertiary hospital. Among patients who developed POD, we conducted hierarchical clustering analysis on the risk factors (identified through univariate screening in the prediction model) to delineate distinct subtypes. We then compared the length of postoperative hospital stay and mortality rates (at 1, 3, 6, and 12 months postoperatively) between the identified clusters.
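The hierarchical clustering step described above can be illustrated with a minimal sketch. The abstract does not state the linkage criterion, distance metric, or preprocessing, so average linkage, Euclidean distance, and z-score standardization are assumptions here. The three features and all patient values are synthetic and hypothetical, chosen only to make the example run; this is not the authors' pipeline.

```python
import math
import random
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardize each feature column (mean 0, SD 1) so no single feature dominates distances."""
    cols = list(zip(*rows))
    stats = [(mean(c), pstdev(c) or 1.0) for c in cols]
    return [[(v - m) / s for v, (m, s) in zip(row, stats)] for row in rows]


def agglomerative_clusters(rows, n_clusters):
    """Average-linkage agglomerative clustering (assumed linkage); returns one label per row."""
    clusters = [[i] for i in range(len(rows))]  # start with every patient in its own cluster

    def linkage(a, b):
        # Mean pairwise Euclidean distance between the members of two clusters.
        return mean(math.dist(rows[i], rows[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        _, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    labels = [0] * len(rows)
    for label, members in enumerate(clusters):
        for idx in members:
            labels[idx] = label
    return labels


# Synthetic, hypothetical data: (age in years, MMSE score, surgery duration in minutes).
rng = random.Random(0)
centers = [(62, 27, 320), (66, 28, 150), (82, 21, 110)]
patients = [
    [rng.gauss(age, 1.5), rng.gauss(mmse, 0.8), rng.gauss(minutes, 10)]
    for age, mmse, minutes in centers
    for _ in range(10)
]

labels = agglomerative_clusters(zscore_columns(patients), n_clusters=3)
print(labels)
```

Standardizing first matters because surgery duration in minutes would otherwise swamp a 30-point cognitive score in the distance calculation. For a cohort of realistic size, a library implementation would be a more practical choice than this cubic-time loop.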
Main results
We identified 14 POD risk factors to develop ML models. The RF model performed best among the six ML models (area under the curve [AUC] of 0.85; 95% confidence interval [CI], 0.78–0.91). SHAP analysis highlighted surgery duration, preoperative mini-mental state examination score, and Edmonton Frail Scale score as the top predictors of POD. Hierarchical clustering identified three distinct POD subtypes: Subtype 1 (high-risk profile with significant comorbidity and inflammatory dysregulation, longest hospitalization: 21.5 days [interquartile range (IQR) 19–28]; p < 0.001), Subtype 2 (resilient majority with optimal survival; Log-rank p < 0.001), and Subtype 3 (advanced age, frailty and low cognitive reserve, shortest hospitalization: 5 days [IQR 4–8]). Kaplan-Meier analysis showed significant 12-month survival differences among the subtypes (Subtype 2 > Subtype 3 > Subtype 1; p < 0.001).
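As a rough illustration of how an AUC and its 95% CI like those reported above can be computed, the sketch below uses the rank-based (Mann-Whitney) definition of AUC and a percentile bootstrap for the interval. The abstract does not say how the CI was derived, so the bootstrap is an assumption, and the labels and predicted scores are made-up toy values, not study data.

```python
import random


def auc(y_true, scores):
    """AUC as the probability that a random positive case outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method; resamples patients with replacement)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that lack one of the two classes
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy example: 1 = developed POD, 0 = did not; scores are hypothetical predicted risks.
y_true = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.3, 0.35, 0.4, 0.6, 0.7, 0.2, 0.9]

point = auc(y_true, scores)  # 15 of 16 positive/negative pairs are ranked correctly: 0.9375
lower, upper = bootstrap_auc_ci(y_true, scores)
print(f"AUC = {point:.4f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

With only eight toy cases the interval is very wide; the narrower interval in the abstract reflects a much larger validation sample.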
Conclusion
Our study validated the utility of ML models, particularly RF, in predicting POD and identified three novel data-driven subtypes with distinct clinical characteristics.
About the journal:
The Journal of Clinical Anesthesia (JCA) addresses all aspects of anesthesia practice, including anesthetic administration, pharmacokinetics, preoperative and postoperative considerations, coexisting disease and other complicating factors, cost issues, and similar concerns anesthesiologists contend with daily. Exceptionally high standards of presentation and accuracy are maintained.
The core of the journal is rigorously peer-reviewed original contributions on subjects relevant to clinical practice. Highly respected international experts have joined together to form the Editorial Board, sharing their years of experience and clinical expertise. Specialized section editors cover the various subspecialties within the field. To keep your practical clinical skills current, the journal bridges the gap between the laboratory and the clinical practice of anesthesiology and critical care to clarify how new insights can improve daily practice.