Interpretable prediction of hospital mortality in bleeding critically ill patients based on machine learning and SHAP.

IF 3.3 · Zone 3 (Medicine) · JCR Q2 (Medical Informatics)

BMC Medical Informatics and Decision Making Pub Date : 2025-07-15 DOI:10.1186/s12911-025-03101-9

Bingkui Ren, Yuping Zhang, Siying Chen, Jinglong Dai, Junci Chong, Yifei Zhong, Mengkai Deng, Shaobo Jiang, Zhigang Chang

Volume 25, issue 1, 263. Publication type: Journal Article.

Citations: 0

Abstract

Background: Hemorrhage is a prevalent and critical condition in the intensive care unit (ICU), characterized by high incidence, elevated mortality rates, and substantial therapeutic challenges. Accurate prediction of mortality in patients with hemorrhage is essential for developing personalized prevention and treatment strategies. Nevertheless, the implementation of effective predictive models in clinical practice remains limited, primarily due to the lack of robust and interpretable tools.

Objective: This study aimed to develop an interpretable model for predicting mortality risk in critically ill patients with hemorrhage admitted to ICUs. The SHapley Additive exPlanations (SHAP) method was applied to interpret the eXtreme Gradient Boosting (XGBoost) model and to identify key prognostic factors in this population.
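
To make the idea behind SHAP concrete, the following is a minimal sketch of exact Shapley value attribution for a single prediction. It is not the SHAP library and not the paper's XGBoost model: the function `toy_risk`, its coefficients, the features `lactate` and `age`, and the baseline values are all hypothetical, chosen only for illustration. Features outside a coalition are replaced by a baseline value, which is one common simplification.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline).

    Features that are absent from a coalition take their baseline value.
    The cost is exponential in the number of features, so this is only
    practical for a handful of inputs.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight |S|! (n - |S| - 1)! / n! for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical toy risk score (NOT the paper's model), with one interaction term.
def toy_risk(z):
    bilirubin, lactate, age = z
    return 0.05 * bilirubin + 0.03 * lactate + 0.001 * age + 0.01 * bilirubin * lactate


x = [4.0, 6.0, 70.0]          # hypothetical patient
baseline = [1.0, 2.0, 60.0]   # hypothetical reference patient
phi = shapley_values(toy_risk, x, baseline)
```

The attributions are additive: they sum to the difference between the prediction for `x` and the prediction for `baseline` (0.5 in this toy case), and the interaction term is split evenly between the two features involved. Ranking features by the mean absolute attribution across patients is one way a "top predictors" list can be produced.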

Methods: In this retrospective cohort study, we used data from the eICU Collaborative Research Database (eICU-CRD) to develop and evaluate a predictive model. Clinical data from the first 24 h after ICU admission were extracted, and the dataset was randomly split into training (80%) and validation (20%) sets. The performance of the XGBoost model was compared with that of four other machine learning algorithms using the area under the curve (AUC). SHAP was used to interpret the XGBoost model. External validation was subsequently performed using data from the Chinese REFRAIN cohort, which focuses on hemorrhage and coagulopathy in critically ill patients.
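
The split-and-score procedure described in the Methods can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the data are synthetic, the helper names `train_validation_split` and `auc` are hypothetical, and the AUC is assumed to be the area under the ROC curve, computed here as the probability that a randomly chosen non-survivor receives a higher score than a randomly chosen survivor.

```python
import random


def train_validation_split(rows, train_fraction=0.8, seed=0):
    """Shuffle rows reproducibly and split them into training and validation lists."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


def auc(labels, scores):
    """ROC AUC via pairwise comparison of positives and negatives (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Synthetic cohort (NOT study data): 1,000 rows of (died, risk_score),
# with 115 deaths to mirror the reported 11.5% mortality rate.
rng = random.Random(42)
rows = []
for k in range(1000):
    died = 1 if k < 115 else 0
    risk_score = rng.gauss(1.0 if died else 0.0, 1.0)
    rows.append((died, risk_score))

train, valid = train_validation_split(rows, train_fraction=0.8, seed=0)
valid_auc = auc([y for y, _ in valid], [s for _, s in valid])
```

In a real comparison, each candidate algorithm would be fitted on `train` only and scored on `valid`, so that the reported AUC values reflect data the model has not seen.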

Trial registration: The study protocol was retrospectively registered in the Chinese Clinical Trial Registry (ChiCTR) on December 17, 2024 (Registration number ChiCTR2400094140).

Results: A total of 10,306 eligible patients with hemorrhage were included. The observed in-hospital mortality rate was 11.5%. Among the five models compared, XGBoost demonstrated the highest predictive performance (AUC = 0.81), whereas logistic regression (LR) showed the lowest generalizability (AUC = 0.726). Decision curve analysis revealed that the XGBoost model provided a greater net benefit than the other models at threshold probabilities of 10-30%. SHAP analysis identified the top 15 predictors of mortality, with bilirubin level ranked as the most influential variable. External validation using the REFRAIN cohort confirmed the robustness of the model (AUC = 0.776).
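
Decision curve analysis compares models by net benefit at a chosen threshold probability. The sketch below uses the standard definition, net benefit = TP/N - (FP/N) x pt/(1 - pt), together with the "treat all" reference strategy. The labels and predicted probabilities are made up for illustration and are not study data; the function names are hypothetical.

```python
def net_benefit(labels, probs, threshold):
    """Net benefit of treating every patient whose predicted risk is >= threshold."""
    n = len(labels)
    tp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(labels, probs) if p >= threshold and y == 0)
    # False positives are penalised by the odds of the threshold probability.
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Made-up example (NOT study data): 10 patients, 2 deaths.
labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
probs = [0.6, 0.05, 0.2, 0.1, 0.4, 0.02, 0.3, 0.08, 0.15, 0.03]

# Evaluate across the 10-30% threshold range mentioned in the Results.
curve = {pt: (net_benefit(labels, probs, pt), net_benefit_treat_all(labels, pt))
         for pt in (0.10, 0.20, 0.30)}
```

A model is useful at a given threshold only if its net benefit exceeds both the "treat all" value and zero (the "treat none" strategy). In this toy example, at a threshold of 0.20 the model's net benefit is 0.15 while "treat all" yields roughly 0.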

Conclusions: The interpretable predictive model improves mortality risk stratification in ICU patients with hemorrhage, supporting clinicians in optimizing treatment plans and resource allocation. Enhanced model transparency through SHAP explanations may facilitate clinical adoption by improving trust in model reliability.

Source journal

BMC Medical Informatics and Decision Making (Medicine: Medical Informatics)

CiteScore: 7.20
Self-citation rate: 5.70%
Articles published: 297
Review time: 1 month

Journal description: BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.