Utilizing machine learning models for predicting outcomes in acute pancreatitis: development and validation in three retrospective cohorts.

IF 3.3, CAS Zone 3 (Medicine), JCR Q2 (Medical Informatics)

BMC Medical Informatics and Decision Making Pub Date : 2025-07-11 DOI:10.1186/s12911-025-03103-7

Kaier Gu, Yang Liu

Volume 25, issue 1, page 261. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12247377/pdf/


Abstract

Background: Acute pancreatitis (AP) is associated with a high readmission rate, yet few models can predict post-discharge outcomes, and existing in-hospital prediction models have notable limitations. This study uses machine learning (ML) to develop prognosis prediction models for AP patients, covering in-hospital mortality, readmission, and post-discharge mortality.

Methods: Clinical and laboratory data of AP patients from three sources (the MIMIC database, the eICU database, and Wenzhou Hospital in China) were analyzed retrospectively and divided into a training set and two validation sets. In the training set, key variables were screened using univariate logistic regression and the LASSO method. Six ML algorithms were used to build predictive models. Model performance was assessed using receiver operating characteristic curves, decision curve analysis, Shapley additive explanations plots, and other relevant metrics. The predictive ability of the ML models was compared with that of clinical scores. The ML models were then validated further in the two external validation sets.
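Two of the evaluation tools named in the Methods, the area under the receiver operating characteristic curve and the net benefit used in decision curve analysis, can be illustrated with a minimal sketch. The abstract does not describe the authors' implementation, so the function names and the toy labels and scores below are assumptions made purely for illustration. Net benefit follows the standard definition TP/n - (FP/n) * pt/(1 - pt), where pt is the threshold probability.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive case receives a
    higher score than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def net_benefit(y_true: Sequence[int], scores: Sequence[float],
                threshold: float) -> float:
    """Net benefit of treating patients whose predicted risk is at or
    above the threshold probability: TP/n - FP/n * pt / (1 - pt)."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 1)
    fp = sum(1 for y, s in zip(y_true, scores) if s >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy for a decision curve: treat every patient."""
    prevalence = sum(y_true) / len(y_true)
    return prevalence - (1.0 - prevalence) * threshold / (1.0 - threshold)


if __name__ == "__main__":
    # Hypothetical outcomes (1 = died) and predicted risks, not study data.
    y = [0, 0, 1, 1, 0, 1, 0, 0]
    risk = [0.1, 0.3, 0.8, 0.6, 0.4, 0.35, 0.2, 0.7]
    print(roc_auc(y, risk))                # 0.8
    print(net_benefit(y, risk, 0.5))       # 0.125
    print(net_benefit_treat_all(y, 0.5))   # -0.25
```

A decision curve is obtained by evaluating `net_benefit` over a range of threshold probabilities and comparing it with the treat-all line and with the treat-none line, whose net benefit is always zero.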

Results: A total of 2,559 AP patients were included. Between 12 and 26 variables were selected for model training. Among the six ML models assessed, the Logistic Regression, Random Forest, and eXtreme Gradient Boosting (XGB) models performed relatively better in predicting in-hospital mortality and mortality within 180 and 365 days after discharge. Decision curve analysis and the two external validation sets further indicated that the XGB model performed best in predicting in-hospital mortality of AP patients admitted to the intensive care unit. Specifically, the XGB model showed a stable area under the curve across centers, balanced sensitivity and specificity, and effectively prevented overfitting through its regularization mechanisms. These properties match the core requirements for robustness in the medical context.
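The balance between sensitivity and specificity mentioned in the Results can be made concrete with a short sketch. The abstract does not state how an operating threshold was chosen, so the use of Youden's J statistic (sensitivity + specificity - 1) here is an assumption, as are the function names and the toy data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(y_true: Sequence[int], scores: Sequence[float],
                            threshold: float) -> Tuple[float, float]:
    """Sensitivity and specificity when a score at or above the
    threshold is treated as a positive prediction."""
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


def youden_threshold(y_true: Sequence[int],
                     scores: Sequence[float]) -> Tuple[float, float]:
    """Return the observed score that maximizes Youden's J, together
    with that J value. On ties the lowest such threshold is kept."""
    best_t, best_j = None, float("-inf")
    for t in sorted(set(scores)):
        sens, spec = sensitivity_specificity(y_true, scores, t)
        j = sens + spec - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


if __name__ == "__main__":
    # Hypothetical outcomes (1 = died) and predicted risks, not study data.
    y = [0, 0, 1, 1, 0, 1, 0, 0]
    risk = [0.1, 0.3, 0.8, 0.6, 0.4, 0.35, 0.2, 0.7]
    t, j = youden_threshold(y, risk)
    print(t, round(j, 3))  # 0.35 0.6
```

Other threshold rules, such as fixing a minimum sensitivity, would fit the same structure; Youden's J is only one common way of weighting the two error types equally.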

Conclusions: Collecting dynamic patient variables during hospitalization and building an XGB model helps identify the short-term and long-term prognoses of AP patients and supports clinicians' decision-making.

Clinical trial number: Not applicable.


Source journal: BMC Medical Informatics and Decision Making (Medicine: Medical Informatics)

CiteScore: 7.20
Self-citation rate: 5.70%
Articles published: 297
Review time: 1 month

Journal description: BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles on the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.