Utilizing machine learning models for predicting outcomes in acute pancreatitis: development and validation in three retrospective cohorts
Authors: Kaier Gu, Yang Liu
Journal: BMC Medical Informatics and Decision Making, 25(1):261. Published 2025-07-11.
DOI: https://doi.org/10.1186/s12911-025-03103-7
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12247377/pdf/
Abstract
Background: Acute pancreatitis (AP) is associated with a high readmission rate, yet few models can predict post-discharge outcomes. Existing in-hospital prediction models also have notable limitations. This study uses machine learning (ML) to develop prognosis prediction models for AP patients, covering in-hospital mortality, readmission, and post-discharge mortality.
Methods: Clinical and laboratory data of AP patients from three sources (the MIMIC database, the eICU database, and Wenzhou Hospital in China) were analyzed retrospectively, and the patients were divided into one training set and two validation sets. In the training set, key variables were screened using univariate logistic regression and the LASSO method. Six ML algorithms were used to build predictive models. Model performance was assessed using receiver operating characteristic (ROC) curves, decision curve analysis, Shapley additive explanations (SHAP) plots, and other relevant metrics. The predictive performance of the ML models was compared with that of clinical scores. The ML models were then further validated in two external validation sets.
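As an illustration of the ROC-based assessment mentioned in the Methods, the area under the ROC curve (AUC) can be computed as the probability that a randomly chosen positive case receives a higher predicted risk than a randomly chosen negative case. The sketch below is a minimal pure-Python version of that rank-based definition; the function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study, which does not state how its AUCs were computed.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Rank-based AUC: P(score of a positive > score of a negative).

    Ties between a positive and a negative score count as half a win.
    Labels are assumed to be 1 (event, e.g. death) or 0 (no event).
    """
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = died in hospital, 0 = survived.
labels = [0, 0, 1, 1]
risks = [0.10, 0.40, 0.35, 0.80]
auc = roc_auc(labels, risks)
print(f"AUC = {auc:.2f}")
```

The pairwise loop is quadratic in the number of patients, which is acceptable for a sketch; a sorted-rank implementation would be preferable for large cohorts.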
Results: A total of 2,559 AP patients were included. Between 12 and 26 variables were selected for model training. Among the six ML models assessed, the Logistic Regression, Random Forest, and eXtreme Gradient Boosting (XGB) models performed relatively better in predicting in-hospital mortality and mortality within 180 and 365 days after discharge. Decision curve analysis and the two external validation sets further indicated that the XGB model performed best in predicting in-hospital mortality of AP patients admitted to the intensive care unit. Specifically, the XGB model showed a stable area under the curve across centers, balanced sensitivity and specificity, and effectively prevented overfitting through its regularization mechanisms. These properties match the core requirement of robustness in the medical context.
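Decision curve analysis, referred to in the Results, compares models by their net benefit at a chosen risk threshold. The standard definition is net benefit = TP/N - (FP/N) x pt/(1 - pt), where pt is the threshold probability. The following sketch applies that formula to hypothetical toy data; the function names and values are assumptions for illustration only and do not reproduce the study's analysis.

```python
from typing import Sequence


def net_benefit(y_true: Sequence[int], y_prob: Sequence[float], threshold: float) -> float:
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(y_true: Sequence[int], threshold: float) -> float:
    """Reference strategy: treat every patient regardless of predicted risk."""
    return net_benefit(y_true, [1.0] * len(y_true), threshold)


# Hypothetical toy data: 1 = event, 0 = no event.
labels = [1, 0, 1, 0, 0]
risks = [0.9, 0.6, 0.4, 0.2, 0.1]
pt = 0.2
nb_model = net_benefit(labels, risks, pt)
nb_all = net_benefit_treat_all(labels, pt)
print(f"model: {nb_model:.3f}, treat all: {nb_all:.3f}, treat none: 0.000")
```

A model is considered useful at a given threshold when its net benefit exceeds both the treat-all and treat-none (net benefit 0) strategies; a full decision curve repeats this calculation over a range of thresholds.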
Conclusions: An XGB model built on dynamic variables collected during hospitalization can help identify the short-term and long-term prognoses of AP patients and support clinicians' decision-making.
About the journal:
BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.