Leveraging data-driven machine learning: From explainable risk prediction to hierarchical clustering-based subtypes of postoperative delirium in a prospective non-cardiac surgery cohort

IF 5.1, Zone 2 (Medicine), Q1 ANESTHESIOLOGY

Journal of Clinical Anesthesia Pub Date : 2025-09-18 DOI:10.1016/j.jclinane.2025.112006

Jin-yun Shi, Da-peng Gao, Rong Chen, Xiao-yi Hu, Lan-yue Zhu, Yue Zhang, Qing Li, Qing-hong Mao, Mu-huo Ji, Di Fan, Qing-ren Liu

Journal of Clinical Anesthesia, Volume 107, Article 112006. Full text: https://www.sciencedirect.com/science/article/pii/S0952818025002673

Citations: 0

Abstract

Study objective

To leverage perioperative indicators in developing an explainable machine learning (ML) model for postoperative delirium (POD) prediction, discover distinct data-driven POD subtypes through hierarchical clustering analysis, and enhance personalized risk stratification to inform targeted clinical interventions.

Methods

This is a secondary analysis of several prospective observational studies, including 1106 patients who underwent non-cardiac surgery. Univariate analysis and least absolute shrinkage and selection operator (LASSO) regression were used to screen essential features associated with POD. We compared six algorithms: adaptive boosting with classification trees, random forest (RF), neural networks, support vector machines, extreme gradient boosting with classification trees, and logistic regression. SHapley Additive exPlanations (SHAP) was used to interpret the best-performing model, which was externally validated in another large tertiary hospital. Among patients who developed POD, we conducted hierarchical clustering analysis on the risk factors (identified through univariate screening in the prediction model) to delineate distinct subtypes. We then compared the length of postoperative hospital stay and mortality rates (at 1, 3, 6, and 12 months postoperatively) between the identified clusters.
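The subtype-discovery step can be illustrated with a minimal sketch of agglomerative (bottom-up) hierarchical clustering. The abstract does not state the distance metric, linkage rule, or software used, so Euclidean distance, average linkage, the function names, and the toy feature rows below are all assumptions chosen for illustration, not the study's configuration or data.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(cluster_a, cluster_b, points):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(points[i], points[j]) for i in cluster_a for j in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def agglomerative_clusters(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain.

    Returns a sorted list of clusters, each a sorted list of row indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > n_clusters:
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(clusters[pair[0]], clusters[pair[1]], points),
        )
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)] + [merged]
    return sorted(clusters)


# Hypothetical standardized rows (age, frailty, inflammation z-scores); not study data.
patients = [
    (1.9, 1.8, -0.2), (2.1, 2.0, 0.1),          # older and frailer
    (-0.1, 0.0, 2.2), (0.2, -0.1, 2.0),         # inflammatory profile
    (-0.9, -1.0, -0.8), (-1.1, -0.9, -1.0), (-1.0, -1.1, -0.9),
]
subtypes = agglomerative_clusters(patients, n_clusters=3)
print(subtypes)
```

In practice the number of clusters would be chosen from the dendrogram or a validity index rather than fixed in advance, and features on different scales would need standardizing first, as the toy z-scores here presuppose.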

Main results

We identified 14 POD risk factors to develop ML models. The RF model performed best among the six ML models (area under the curve [AUC] of 0.85; 95% confidence interval [CI], 0.78–0.91). SHAP analysis highlighted surgery duration, preoperative mini-mental state examination score, and Edmonton Frail Scale as the top predictors of POD. Hierarchical clustering identified three distinct POD subtypes: Subtype 1 (high-risk profile with significant comorbidity and inflammatory dysregulation; longest hospitalization: 21.5 days [interquartile range (IQR) 19–28]; p < 0.001), Subtype 2 (resilient majority with optimal survival; Log-rank p < 0.001), and Subtype 3 (advanced age, frailty and low cognitive reserve; shortest hospitalization: 5 days [IQR 4–8]). Kaplan-Meier analysis showed significant 12-month survival differences among the subtypes (Subtype 2 > Subtype 3 > Subtype 1; p < 0.001).
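For readers less familiar with how an AUC and its interval are obtained, the following is a minimal sketch using the rank-based definition of AUC and a percentile bootstrap. The abstract does not say how the reported confidence interval was computed, so the bootstrap, the function names, and the labels and scores below are assumptions for illustration only.

```python
import random


def auc(labels, scores):
    """AUC as the chance a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC; single-class resamples are redrawn."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # keep only resamples containing both classes
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Hypothetical held-out labels (1 = POD) and predicted risks; not study data.
y_true = [0, 0, 0, 0, 1, 0, 1, 0, 1, 1]
y_prob = [0.05, 0.10, 0.20, 0.30, 0.35, 0.40, 0.60, 0.65, 0.80, 0.90]
point = auc(y_true, y_prob)
low, high = bootstrap_ci(y_true, y_prob)
print(f"AUC {point:.2f} (95% CI {low:.2f}-{high:.2f})")
```

With only ten cases the interval is very wide; the sketch is meant to show the mechanics, not to reproduce the reported 0.78–0.91 range.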

Conclusion

Our study validated the utility of ML models, particularly RF, in predicting POD and identified three novel data-driven subtypes with distinct clinical characteristics.


Source journal

Journal of Clinical Anesthesia (Medicine: Anesthesiology)

CiteScore: 7.40

Self-citation rate: 4.50%

Articles published: 346

Review time: 23 days

About the journal: The Journal of Clinical Anesthesia (JCA) addresses all aspects of anesthesia practice, including anesthetic administration, pharmacokinetics, preoperative and postoperative considerations, coexisting disease and other complicating factors, cost issues, and similar concerns anesthesiologists contend with daily. Exceptionally high standards of presentation and accuracy are maintained. The core of the journal is rigorously peer-reviewed original contributions on subjects relevant to clinical practice. Highly respected international experts have joined together to form the Editorial Board, sharing their years of experience and clinical expertise. Specialized section editors cover the various subspecialties within the field. To keep your practical clinical skills current, the journal bridges the gap between the laboratory and the clinical practice of anesthesiology and critical care to clarify how new insights can improve daily practice.