Fengwei Yao, Ji Luo, Qian Zhou, Luhua Wang, Zhijun He
Scientific Reports, 15(1), 2743. Published 2025-01-22. DOI: https://doi.org/10.1038/s41598-025-86674-9. Impact factor: 3.9. JCR: Q1 (Multidisciplinary Sciences). Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11751441/pdf/
Citations: 0
Development and validation of a machine learning-based prediction model for hepatorenal syndrome in liver cirrhosis patients using MIMIC-IV and eICU databases.
Hepatorenal syndrome (HRS) is a key contributor to poor prognosis in liver cirrhosis. This study uses two large public databases, MIMIC-IV and eICU, to build a predictive model for early identification of high-risk patients. From these databases, we retrieved information on cirrhosis patients' therapies, comorbidities, laboratory results, and demographics. Patients from the MIMIC-IV database were divided into training and validation groups, while patients from the eICU database served as a test set for external validation. Variables were screened using LASSO regression, Extreme Gradient Boosting (XGBoost), and Random Forest (RF), and core risk factors were taken from the intersection of the three methods. A predictive model was constructed using multivariable logistic regression and visualized via a nomogram. Model performance was assessed using ROC curves, decision curve analysis (DCA), clinical impact curves (CIC), and calibration curves. Eight critical variables associated with HRS were identified using machine learning methods. The final predictive model, based on five key variables (spontaneous bacterial peritonitis, red blood cell count, creatinine, activated partial thromboplastin time, and total bilirubin), showed excellent discrimination, with AUCs of 0.832 (95% CI 0.8069-0.8563) in the training set and 0.8415 (95% CI 0.8042-0.8789) in the validation set. The AUC in the external test set was 0.8212 (95% CI 0.7784-0.864). By integrating the MIMIC-IV database and machine learning algorithms, we developed an effective predictive model for HRS in liver cirrhosis patients, providing a robust tool for early clinical intervention.
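The abstract reports discrimination as an AUC with a 95% confidence interval. As a rough illustration of what such a figure measures, the following sketch computes the AUC as the fraction of (positive, negative) patient pairs that the model ranks correctly, and attaches a percentile bootstrap interval. It is not the authors' code: the abstract does not say how the intervals were obtained, the bootstrap is an assumption chosen for simplicity, the function names are hypothetical, and the toy labels and scores are invented for the example.

```python
import random


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the AUC (an assumed method, not the paper's)."""
    auc(labels, scores)  # raises early if only one class is present
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients with replacement
        ys = [labels[i] for i in idx]
        if 0 < sum(ys) < n:  # skip resamples that contain a single class
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Toy data, invented for illustration: 1 = developed HRS, 0 = did not.
toy_labels = [0, 0, 1, 0, 1, 1, 0, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.70, 0.90, 0.20, 0.60]

print(auc(toy_labels, toy_scores))  # 12 of 16 pairs ranked correctly -> 0.75
print(bootstrap_auc_ci(toy_labels, toy_scores, n_boot=500))
```

The pairwise loop is quadratic in the number of patients, which is fine for a sketch. A rank-based formula or an established statistics package would be the more practical choice for cohorts the size of MIMIC-IV or eICU.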
About the journal:
We publish original research from all areas of the natural sciences, psychology, medicine and engineering. You can learn more about what we publish by browsing our specific scientific subject areas below or explore Scientific Reports by browsing all articles and collections.
Scientific Reports has a 2-year impact factor of 4.380 (2021) and is the 6th most-cited journal in the world, with more than 540,000 citations in 2020 (Clarivate Analytics, 2021).
•Engineering
Engineering covers all aspects of engineering, technology, and applied science. It plays a crucial role in the development of technologies to address some of the world's biggest challenges, helping to save lives and improve the way we live.
•Physical sciences
Physical sciences are those academic disciplines that aim to uncover the underlying laws of nature — often written in the language of mathematics. It is a collective term for areas of study including astronomy, chemistry, materials science and physics.
•Earth and environmental sciences
Earth and environmental sciences cover all aspects of Earth and planetary science and broadly encompass solid Earth processes, surface and atmospheric dynamics, Earth system history, climate and climate change, marine and freshwater systems, and ecology. It also considers the interactions between humans and these systems.
•Biological sciences
Biological sciences encompass all the divisions of natural sciences examining various aspects of vital processes. The concept includes anatomy, physiology, cell biology, biochemistry and biophysics, and covers all organisms, from microorganisms to animals and plants.
•Health sciences
The health sciences study health, disease and healthcare. This field of study aims to develop knowledge, interventions and technology for use in healthcare to improve the treatment of patients.