Title: Deep learning models for ICU readmission prediction: a systematic review and meta-analysis
Authors: Emanuele Koumantakis, Konstantina Remoundou, Nicoletta Colombi, Carmen Fava, Ioanna Roussaki, Alessia Visconti, Paola Berchialla
Journal: Critical Care (volume 34 1; impact factor 9.3; JCR Q1, Critical Care Medicine)
Published: 2025-10-17
DOI: https://doi.org/10.1186/s13054-025-05642-x
Citations: 0
Abstract
Intensive Care Unit (ICU) readmissions are associated with increased morbidity, mortality, and healthcare costs. Therefore, determining the appropriate timing of ICU discharge is critical. In this context, deep learning (DL) approaches have attracted significant attention. We conducted a systematic review of studies developing or validating DL models for ICU readmission prediction, published up to March 4th, 2025, and indexed in PubMed, Embase, Scopus, and Web of Science. We summarised them along multiple dimensions, including outcome and population definition, DL architecture, reproducibility, generalizability, and explainability, and provided a meta-analytic estimate of model performance. We included 24 studies encompassing 49 DL models, predominantly trained on US-based datasets and rarely subjected to external validation. There was considerable variability across study settings, including the definition and timeframe of the ICU readmission outcome and the DL architecture used, alongside a substantial risk of bias. Technical reproducibility and model interpretation were rarely addressed. A meta-analysis of AUROC values from 11 studies yielded a mean of 0.78 (95% CI = 0.72–0.84), with very high heterogeneity (I² = 99.9%). Models targeting disease-specific ICU subpopulations achieved significantly higher performance (mean AUROC = 0.92, 95% CI = 0.89–0.95, p = 0.002) and substantially lower heterogeneity (I² = 17.1%). DL models showed promising performance in predicting ICU readmissions, but exhibited several shortcomings, including low reproducibility, over-reliance on a few US-based datasets, and limited explainability. Additionally, the high heterogeneity and risk of bias limited our ability to assess their pooled performance through meta-analysis. Taken together, our observations suggest that the quality of the evidence regarding the application of DL approaches to ICU readmission prediction is poor, thus hindering their clinical applicability.
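To illustrate how a pooled AUROC, its 95% CI, and the I² heterogeneity statistic of the kind reported above can be obtained, the following is a minimal sketch of a DerSimonian-Laird random-effects meta-analysis. The abstract does not state which estimator or scale the authors used, so the estimator, the pooling on the raw AUROC scale, the function name `dersimonian_laird`, and the input values are all assumptions made for illustration; the numbers are hypothetical and are not taken from the reviewed studies.

```python
import math


def dersimonian_laird(effects, std_errors):
    """Pool per-study estimates with a DerSimonian-Laird random-effects model.

    effects:    per-study point estimates (here: AUROC values, raw scale)
    std_errors: per-study standard errors of those estimates
    """
    k = len(effects)
    variances = [se ** 2 for se in std_errors]

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # Between-study variance tau^2, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I^2: share of total variability attributed to heterogeneity, in percent.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau^2 to each within-study variance.
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    ci95 = (pooled - 1.96 * se_pooled, pooled + 1.96 * se_pooled)

    return {"pooled": pooled, "ci95": ci95, "tau2": tau2, "q": q, "i2": i2}


# Hypothetical AUROCs and standard errors, for illustration only.
example_aurocs = [0.70, 0.75, 0.82, 0.78, 0.88]
example_ses = [0.010, 0.015, 0.008, 0.020, 0.012]

result = dersimonian_laird(example_aurocs, example_ses)
print(f"pooled AUROC = {result['pooled']:.3f}")
print(f"95% CI = {result['ci95'][0]:.3f} to {result['ci95'][1]:.3f}")
print(f"I^2 = {result['i2']:.1f}%")
```

When the per-study estimates are precise but disagree with each other, Q becomes large relative to its degrees of freedom and I² approaches 100%, which is the situation the abstract describes for the overall pool. In practice, AUROC values are often pooled on a transformed (for example logit) scale; the raw scale is used here only to keep the sketch short.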
About the journal:
Critical Care is an esteemed international medical journal that undergoes a rigorous peer-review process to maintain its high quality standards. Its primary objective is to enhance the healthcare services offered to critically ill patients. To achieve this, the journal focuses on gathering, exchanging, disseminating, and endorsing evidence-based information that is highly relevant to intensivists. By doing so, Critical Care seeks to provide a thorough and inclusive examination of the intensive care field.