Unsupervised learning-derived phenotypes for personalized fluid management in critically ill patients with heart failure: A multicenter study

IF 6.8 · Zone 1 (Medicine) · Q1 Medicine, Research & Experimental

Clinical and Translational Medicine · Pub Date: 2024-11-08 · DOI: 10.1002/ctm2.70081

Chengjian Guan, Angwei Gong, Yan Zhao, Hangtian Yu, Shuaidan Zhang, Zhiyi Xie, Yehui Jin, Xiuchun Yang, Jingchao Lu, Bing Xiao

Volume 14, issue 11 · Journal Article · Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11546239/pdf/ · Publisher page: https://onlinelibrary.wiley.com/doi/10.1002/ctm2.70081


Abstract

Dear Editor,

Fluid balance management in critically ill heart failure (HF) patients remains a formidable clinical challenge. While clinicians typically aim for net negative fluid balance to alleviate symptoms, recent studies employing fixed strategies have yielded inconsistent results.¹,² The 2024 Heart Failure Association guidelines of the European Society of Cardiology emphasized the importance of individualized fluid balance strategies, particularly for critically ill patients.³ Our study introduces a novel approach using unsupervised learning to identify four distinct phenotypes of critically ill HF patients, each with unique clinical characteristics and fluid balance requirements. To facilitate clinical application, we have developed a user-friendly interface that enables rapid phenotype identification and customized fluid management.

We used two non-overlapping data sources: the III (CareVue subset) and IV versions of the Medical Information Mart for Intensive Care (MIMIC)⁴ as training cohorts, and the eICU Collaborative Research Database (eICU)⁵ for external validation (Method S1). The MIMIC cohort comprised 5998 patients and the eICU cohort 2549 patients (Figure S1). We initially extracted 56 variables from the first day of ICU admission. After eliminating variables with more than 30% missing data, 47 variables remained, encompassing demographics, comorbidities, laboratory values, vital signs, interventions, and severity scores. To ensure a balanced contribution of characteristics, all data underwent cleaning and normalization (Method S2, Figure S2). In-hospital mortality served as our primary outcome, with ICU length of stay and total hospital length of stay as secondary outcomes.
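
The variable screening described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the 30% threshold comes from the text, but the use of z-score standardization, the `None`-as-missing convention, and the toy column names and values are assumptions (the actual procedure is in Method S2, which is not reproduced here).

```python
from statistics import mean, pstdev

MISSING_THRESHOLD = 0.30  # from the text: variables with >30% missing data are dropped


def missing_fraction(values):
    """Fraction of entries that are None (assumed encoding for 'missing')."""
    return sum(v is None for v in values) / len(values)


def filter_variables(table, threshold=MISSING_THRESHOLD):
    """Keep only columns whose missing fraction does not exceed the threshold."""
    return {name: col for name, col in table.items()
            if missing_fraction(col) <= threshold}


def zscore(values):
    """Standardize observed values to mean 0 and SD 1; missing entries stay None.

    Z-scoring is an assumed choice of normalization, used here for illustration.
    """
    observed = [v for v in values if v is not None]
    mu, sd = mean(observed), pstdev(observed)
    if sd == 0:
        return [None if v is None else 0.0 for v in values]
    return [None if v is None else (v - mu) / sd for v in values]


# Hypothetical toy data: column name -> first-day values for five patients.
toy = {
    "age": [71, 64, None, 80, 58],            # 20% missing, kept
    "lactate": [None, None, 2.1, None, 1.4],  # 60% missing, dropped
}
kept = filter_variables(toy)
normalized = {name: zscore(col) for name, col in kept.items()}
```

Standardizing after filtering keeps variables measured on large scales (for example, platelet count) from dominating distance-based clustering.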

Uniform Manifold Approximation and Projection (UMAP) was used to confirm that clinical characteristics did not differ between the two training databases (Figure S3). To classify patients, we applied the K-prototypes clustering algorithm, which accommodates mixed numerical and categorical attributes while preserving the characteristics of categorical (factor) variables (Method S3). The optimal number of clusters was determined using standard tests, considering both statistical metrics and clinical relevance. This approach ultimately identified four distinct phenotypes (Figure S4).
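
To make the mixed-attribute idea concrete, the sketch below shows the dissimilarity that K-prototypes uses: squared Euclidean distance on numerical attributes plus a weight `gamma` times the number of mismatched categorical attributes. Only the assignment step is shown. The value of `gamma`, the feature choices, and the prototype values are hypothetical and are not taken from Method S3.

```python
def mixed_dissimilarity(num_a, cat_a, num_b, cat_b, gamma):
    """K-prototypes dissimilarity between two records.

    Numerical part: squared Euclidean distance.
    Categorical part: simple matching (count of attributes that differ),
    weighted by gamma.
    """
    numeric_part = sum((x - y) ** 2 for x, y in zip(num_a, num_b))
    categorical_part = sum(x != y for x, y in zip(cat_a, cat_b))
    return numeric_part + gamma * categorical_part


def assign_to_prototype(patient, prototypes, gamma=1.0):
    """Return the index of the closest prototype for one (numeric, categorical) record."""
    num, cat = patient
    distances = [mixed_dissimilarity(num, cat, p_num, p_cat, gamma)
                 for p_num, p_cat in prototypes]
    return min(range(len(prototypes)), key=distances.__getitem__)


# Hypothetical prototypes: (standardized numeric values, categorical flags).
# Numeric: e.g. standardized age and BUN; categorical: ventilation, vasoactive drugs.
prototypes = [
    ([0.0, 0.0], ("no", "no")),
    ([1.5, 1.0], ("yes", "yes")),
]
patient = ([1.2, 0.8], ("yes", "no"))
cluster = assign_to_prototype(patient, prototypes)
```

A full implementation alternates this assignment step with a prototype update (means for numerical attributes, modes for categorical ones) until assignments stop changing.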

Comparative analysis of these phenotypes revealed distinct clinical profiles (Figure 1, Table 1, Table S1). Phenotype A was characterized by aggressive interventions and inflammation, including high rates of vasoactive drug use, antibiotic use, and mechanical ventilation. This group also exhibited the highest white blood cell count and chloride levels, coupled with the lowest platelet count. Phenotype B represented the mildest form with the most favourable prognosis. Phenotype C was distinguished by the highest mean age, lowest body weight, higher comorbidity burden, and second-highest mortality rate, despite having the lowest Sequential Organ Failure Assessment (SOFA) score. Phenotype D presented the most severe clinical profile with a poor prognosis. Short- and long-term survival outcomes differed significantly among these phenotypes, with Phenotype D showing the worst prognosis and Phenotype B the best (Figure 2A,B). To validate our findings, we conducted correlation analysis among continuous variables and excluded highly correlated factors before re-clustering. The resulting phenotypes retained consistent characteristics, confirming the stability of our clustering method (Figures S5–S7).
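
The sensitivity check (dropping highly correlated continuous variables before re-clustering) might look like the sketch below. The text does not state which correlation coefficient or cut-off was used, so Pearson's r, the 0.8 threshold, the greedy keep-first rule, and the toy values are all assumptions made for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def drop_correlated(columns, threshold=0.8):
    """Greedily keep a variable only if |r| with every kept variable is below threshold.

    The threshold is a hypothetical value; the source does not report one.
    """
    kept = []
    for name in columns:
        if all(abs(pearson(columns[name], columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical toy values for four patients.
toy = {
    "bun": [10, 20, 30, 40],
    "creatinine": [1.0, 2.1, 2.9, 4.2],  # tracks BUN closely in this toy data
    "heart_rate": [88, 70, 95, 72],
}
retained = drop_correlated(toy)
```

Re-running the clustering on `retained` and comparing the resulting groups with the original ones is one way to check that the phenotypes are not an artifact of redundant variables.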

Additionally, we investigated phenotype-specific fluid management strategies using net daily fluid balance data, adjusted for demographic factors, laboratory parameters, vital signs, and interventions. The impact of these strategies on in-hospital mortality was analyzed using the parametric g-formula (Method S4), focusing on the first seven days of ICU admission. Figure 2C illustrates how different fluid management strategies influence in-hospital mortality across the four patient phenotypes over 7 days. Phenotype A presented with severe respiratory failure, shock, and inflammation, and benefited from fluid balances between −1000 and 500 mL daily. This aligned with recent studies on acute respiratory distress syndrome (ARDS) and ventilator-related events, confirming the adverse effects of positive fluid balance on mechanical ventilation duration and mortality.⁶,⁷ Phenotype C, despite milder clinical parameters, exhibited significantly higher mortality rates than Phenotype B, likely due to older age and multiple comorbidities. We recommended a daily net fluid balance ranging from −1500 to 500 mL for this group, underscoring the impact of age and frailty on HF prognosis.⁸ Phenotype D, characterized by severe metabolic derangements including acidosis, renal dysfunction, and sepsis, had the poorest prognosis. Our results suggested a more restrictive fluid strategy (−2000 to −500 mL daily), consistent with studies demonstrating adverse effects of positive fluid balance in populations with kidney disease and sepsis.⁹,¹⁰ Phenotype B showed no clear benefit from specific fluid management strategies, warranting further investigation to determine whether this reflected the relative mildness of their condition or methodological limitations.
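
The ranges reported in this paragraph can be collected into a small lookup, sketched below. The numbers are those stated in the text; the function name, the returned wording, and the treatment of the bounds as inclusive are assumptions for illustration.

```python
# Daily net fluid balance ranges (mL/day) as reported in the text.
# Phenotype B has no entry because no clear benefit was observed.
FLUID_TARGETS_ML_PER_DAY = {
    "A": (-1000, 500),
    "B": None,
    "C": (-1500, 500),
    "D": (-2000, -500),
}


def check_fluid_balance(phenotype, net_balance_ml):
    """Compare a daily net fluid balance with the reported range for a phenotype.

    Bounds are treated as inclusive, which is an assumption.
    """
    target = FLUID_TARGETS_ML_PER_DAY[phenotype]
    if target is None:
        return "no phenotype-specific range reported"
    low, high = target
    if net_balance_ml < low:
        return "below reported range"
    if net_balance_ml > high:
        return "above reported range"
    return "within reported range"


# Example: a neutral daily balance for a Phenotype D patient.
result = check_fluid_balance("D", 0)
```

Laid out this way, the contrast between phenotypes is easy to see: a neutral balance of 0 mL falls inside the reported range for A and C but above it for D.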

To facilitate the efficient classification of HF phenotypes across different cohorts, we developed a machine learning-based classification model. The Joint Mutual Information Maximisation (JMIM) method identified nine variables with a feature importance score > 0.8: age, blood urea nitrogen (BUN), hematocrit, vasoactive drugs, renal disease, creatinine, diastolic blood pressure (DBP), mechanical ventilation, and anion gap (Figure 2D). Based on benchmark tests (Table S3), we selected the Extreme Gradient Boosting (XGBoost) model for phenotype classification. The model achieved high predictive performance in MIMIC (AUC: 0.918–0.943) (Figure 2E) and satisfactory performance in the eICU cohort used for external validation (AUC: 0.802–0.907) (Figure S9). Performance metrics and decision curve analysis showed that the XGBoost model had good performance and clinical net benefit (Table S4, Figure S8). We then conducted an interpretability analysis to visualize the model's decision-making process (Figures S10–S12). To support clinical application, we developed a web-based tool (https://7kdtqk-guanchengcheng.shinyapps.io/hf_phenotype/).
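
An AUC range across four phenotypes is commonly obtained by scoring each class against the rest. The sketch below computes AUC from its rank interpretation (the probability that a randomly chosen positive case scores higher than a randomly chosen negative one). Whether the authors used exactly this one-vs-rest scheme is an assumption, and the labels and scores are hypothetical.

```python
def binary_auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count as 0.5."""
    positives = [s for l, s in zip(labels, scores) if l]
    negatives = [s for l, s in zip(labels, scores) if not l]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


def one_vs_rest_auc(true_classes, class_scores):
    """Per-class AUC: each phenotype is scored against all the others."""
    return {c: binary_auc([t == c for t in true_classes], scores)
            for c, scores in class_scores.items()}


# Hypothetical predicted probabilities for four patients and two of the phenotypes.
true_classes = ["A", "B", "A", "C"]
class_scores = {
    "A": [0.8, 0.3, 0.6, 0.1],
    "B": [0.1, 0.4, 0.5, 0.2],
}
aucs = one_vs_rest_auc(true_classes, class_scores)
```

Reporting the minimum and maximum of such per-class values gives a range of the kind quoted in the text.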

While our study provides valuable insights, several limitations warrant acknowledgement. First, the retrospective nature of the study precludes clear causal inference. Additionally, the absence of some important variables (such as ejection fraction and natriuretic peptides) may have led to the omission of potentially significant factors. Future randomized controlled trials are necessary to confirm the efficacy of phenotype-specific fluid management strategies. Second, our analysis primarily focused on net fluid intake and its impact on prognosis. Further research is needed to explore the effects of infusion rates and individual fluid responsiveness on patient outcomes.

In summary, we have identified four distinct phenotypes of critically ill heart failure patients, each with unique clinical characteristics and fluid management needs. Our novel classification model and user interface facilitate rapid phenotype identification, enabling personalized fluid management strategies.

Bing Xiao contributed to the research design. Chengjian Guan, Angwei Gong, and Yan Zhao contributed to data collection, data processing, and graphing. Chengjian Guan, Hangtian Yu, and Shuaidan Zhang conducted model construction and deployment. Zhiyi Xie, Yehui Jin, Xiuchun Yang, and Jingchao Lu contributed to data proofreading and formal analysis. Chengjian Guan and Angwei Gong contributed to the writing of the manuscript. Bing Xiao contributed to the review and editing. All authors have read and approved the final manuscript.

The authors declare no conflict of interest.

This study was conducted in accordance with the Declaration of Helsinki. Because the public databases used in this study contain only de-identified data, individual informed consent was not required; we informed the Ethics Committee of this without submitting a written report.

Source journal: Clinical and Translational Medicine

CiteScore: 15.90 · Self-citation rate: 1.90% · Articles published: 450 · Review time: 4 weeks

About the journal: Clinical and Translational Medicine (CTM) is an international, peer-reviewed, open-access journal dedicated to accelerating the translation of preclinical research into clinical applications and fostering communication between basic and clinical scientists. It highlights the clinical potential and application of various fields including biotechnologies, biomaterials, bioengineering, biomarkers, molecular medicine, omics science, bioinformatics, immunology, molecular imaging, drug discovery, regulation, and health policy. With a focus on the bench-to-bedside approach, CTM prioritizes studies and clinical observations that generate hypotheses relevant to patients and diseases, guiding investigations in cellular and molecular medicine. The journal encourages submissions from clinicians, researchers, policymakers, and industry professionals.