ICU readmission and mortality risk prediction: Generalizability of a multi-hospital model
Tariq A. Dam, Daan de Bruin, Giovanni Cinà, Patrick J. Thoral, Paul W.G. Elbers, Corstiaan A. den Uil, Reinier F. Crane
Journal of intensive medicine, Volume 5, Issue 4, Pages 377–384, 1 October 2025
DOI: 10.1016/j.jointm.2025.03.007
https://www.sciencedirect.com/science/article/pii/S2667100X25000301
Abstract
Background
Inadvertent intensive care unit (ICU) readmission is associated with longer length of stay and increased mortality. Conversely, delayed ICU discharge may represent inefficient use of resources. To better inform discharge timing, several hospitals have implemented machine learning models to predict readmission risk following discharge. However, these models are typically created locally and may not generalize well to other hospitals or patient populations. A single multi-hospital-based model might provide more accurate predictions and insight into features that are applicable across diverse clinical settings.
Methods
This study involved a retrospective multi-center cohort from one academic hospital (Amsterdam University Medical Center [AUMC]) and two large teaching hospitals (Maasstad Ziekenhuis [MSZ] and OLVG). Data from the latter two hospitals were combined to create a pooled model, which was tested on the academic hospital dataset. For each hospital, data on all adult ICU patients were included, spanning from the implementation of its electronic health record system to the start of model development. An XGBoost model was trained to predict a composite outcome of readmission or mortality within 7 days of discharge, and an autoencoder was used as an out-of-distribution (OOD) detector to capture dataset heterogeneity.
Results
In total, 44,837 patients were available for analysis across the three hospitals. The average readmission rates were 7.1%, 6.9%, and 5.9% for MSZ, OLVG, and AUMC, respectively. Evaluated on AUMC data, the local MSZ, OLVG, and AUMC models achieved weighted areas under the receiver operating characteristic curve (AUROC) of 69.7% ± 0.8%, 70.5% ± 0.5%, and 76.5% ± 1.9%, respectively, whereas the pooled model achieved a weighted AUROC of 71.1% ± 0.7%. The difference between internal and external performance was reduced when cardiac surgery patients were excluded. The key features across models were albumin levels and the use of oxygen therapy.
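The AUROC figures above are reported as mean ± spread. One common way to obtain such intervals (the abstract does not say which the authors used, so this is an assumption) is bootstrap resampling of the evaluation set, sketched here on synthetic scores:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)

# Synthetic evaluation set: binary outcomes and model scores that are
# mildly informative (positives tend to score higher).
y_true = rng.integers(0, 2, size=1000)
y_score = y_true * 0.3 + rng.normal(0.35, 0.25, size=1000)

# Bootstrap the AUROC: resample patients with replacement, recompute,
# and summarize as mean ± standard deviation (e.g. "71.1% ± 0.7%").
aucs = []
for _ in range(200):
    idx = rng.integers(0, len(y_true), size=len(y_true))
    if y_true[idx].min() == y_true[idx].max():
        continue  # AUROC is undefined on a single-class resample
    aucs.append(roc_auc_score(y_true[idx], y_score[idx]))

mean_auc, sd_auc = np.mean(aucs), np.std(aucs)
print(f"AUROC = {100 * mean_auc:.1f}% ± {100 * sd_auc:.1f}%")
```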
Discussion
A single, multi-hospital-based model performed comparably on external datasets, especially when cardiac surgery patients were excluded. However, when applied externally, model predictions risk being miscalibrated for specific patient subgroups and require careful recalibration before implementation. While external models were more stable than local ones across OOD scores, their performance was comparable after excluding cardiac surgery patients. Although pooling data marginally improved performance on external datasets, incorporating data from a more diverse set of hospitals is likely to provide greater benefits.
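The recalibration step the discussion calls for can be illustrated with isotonic regression, one standard choice alongside Platt scaling (the abstract does not specify a method, so this is an assumption). The idea: take the raw risks an externally developed model produces, fit a monotone mapping against locally observed outcomes, and deploy the remapped risks.

```python
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(0)

# Hypothetical scenario: an external model systematically over-predicts
# risk at the new hospital, so its raw outputs need local recalibration.
raw_risk = rng.uniform(0, 1, size=800)                      # external model outputs
y_local = rng.binomial(1, np.clip(raw_risk * 0.6, 0, 1))    # local outcomes

# Fit a monotone (isotonic) mapping from raw risk to observed event rate.
iso = IsotonicRegression(out_of_bounds="clip")
calibrated = iso.fit_transform(raw_risk, y_local)
```

After fitting on a local validation cohort, `iso.predict` would be applied to every new prediction; the mapping preserves the model's patient ranking (and hence its AUROC) while aligning predicted risks with locally observed event rates.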