Explainable machine learning and online calculators to predict heart failure mortality in intensive care units.

Impact factor 3.2 · CAS Zone 2 (Medicine) · JCR Q2, Cardiac & Cardiovascular Systems
An-Tian Chen, Yuhui Zhang, Jian Zhang
ESC Heart Failure. Journal Article. Published 2024-09-19. DOI: 10.1002/ehf2.15062 (https://doi.org/10.1002/ehf2.15062)

Abstract

Aims: This study aims to develop explainable machine learning models and clinical tools for predicting mortality in patients with heart failure (HF) in the intensive care unit (ICU).

Methods: Patients diagnosed with HF whose first ICU stay lasted between 24 h and 28 days were selected from the Medical Information Mart for Intensive Care IV (MIMIC-IV) database. The primary outcome was all-cause mortality within 28 days. Data analysis was performed using Python and R, with feature selection conducted via least absolute shrinkage and selection operator (LASSO) regression. Fifteen models were evaluated, and the best-performing model was made explainable with the Shapley additive explanations (SHAP) approach. A nomogram based on logistic regression was developed to facilitate interpretation. The eICU database was used for external validation.
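The LASSO selection step can be illustrated with a small, dependency-free sketch. The abstract does not say which library or loss the authors used, so the following is an assumption: it minimises the squared-error objective (1/2n)·||y − Xw||² + α·||w||₁ by cyclic coordinate descent and keeps the features whose coefficients stay non-zero. The function names, the synthetic data and the continuous toy target are all hypothetical; only α = 0.02 echoes the value reported in the Results.

```python
import random


def soft_threshold(rho, alpha):
    """Soft-thresholding operator used by the LASSO coordinate update."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1/2n)*||y - Xw||^2 + alpha*||w||_1 by cyclic coordinate descent.

    X is a list of rows. Columns of X and the target y are assumed to be centred.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # constant column carries no information
            # Correlation of column j with the residual that excludes feature j.
            rho = sum(X[i][j] * (residual[i] + w[j] * X[i][j]) for i in range(n)) / n
            new_wj = soft_threshold(rho, alpha) / col_sq[j]
            delta = new_wj - w[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= delta * X[i][j]
                w[j] = new_wj
    return w


# Synthetic illustration (hypothetical data): only features 0 and 3 drive the target.
random.seed(0)
n, p = 300, 6
raw = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
means = [sum(row[j] for row in raw) / n for j in range(p)]
X = [[row[j] - means[j] for j in range(p)] for row in raw]
y = [2.0 * r[0] - 1.5 * r[3] + random.gauss(0.0, 0.05) for r in X]
y_mean = sum(y) / n
y = [v - y_mean for v in y]

w = lasso_coordinate_descent(X, y, alpha=0.02)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print("selected feature indices:", selected)
```

The loop is only meant to show how the L1 penalty drives uninformative coefficients exactly to zero; a tested library implementation would be the sensible choice for real data.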

Results: After selection, the study included 2343 records: 1808 survivors and 535 non-survivors (28-day mortality 22.8%). The median age of the study population was 70.00 years, and 60.31% were male. The median length of stay in the ICU was 6.00 days. The survival group was younger than the non-survival group (median age 69.00 vs. 73.00 years), and non-survivors stayed longer in the ICU. Seventy-five features were initially selected, including basic information, vital signs, laboratory tests, haemodynamics and oxygen status. LASSO regression determined the shrinkage parameter α = 0.020, and 44 features were chosen for model construction. The linear discriminant analysis (LDA) model performed best, with an accuracy of 0.8354 in the training cohort and 0.8563 in the testing cohort. It showed satisfactory area under the curve (AUC), recall, precision, F1 score, Cohen's kappa score and Matthews correlation coefficient. The concordance index (c-index) reached 0.7972 in the training cohort and 0.8125 in the testing cohort. In external validation, the LDA model achieved approximately 0.9 in accuracy, precision, recall and F1 score, with an AUC of 0.79. Univariable analysis was performed in the training cohort. Features that differed significantly between the survival and non-survival groups were entered into multiple logistic regression. The nomogram built on multiple logistic regression included 14 features and demonstrated excellent performance. The AUC of the nomogram was 0.852 in the training cohort, 0.855 in the internal validation cohort and 0.770 in the external validation cohort. The calibration curve showed good consistency.
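For readers less familiar with the reported metrics, the following sketch computes accuracy, precision, recall, F1 score, Cohen's kappa, the Matthews correlation coefficient and AUC from their standard definitions. It uses a made-up toy sample, not the study data, and the function names are hypothetical. For a binary outcome scored by a single risk value, the pairwise AUC below is the same quantity as a concordance index; the abstract does not state how the authors computed theirs.

```python
import math


def classification_metrics(y_true, y_pred):
    """Threshold-based metrics for binary labels (1 = died, 0 = survived)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    n = tp + tn + fp + fn

    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_chance = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - p_chance) / (1 - p_chance) if p_chance != 1 else 0.0

    # Matthews correlation coefficient.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f1": f1, "kappa": kappa, "mcc": mcc}


def auc(y_true, y_score):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if ps > ns else 0.5 if ps == ns else 0.0
               for ps in pos for ns in neg)
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical labels and risk scores).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1, 0.4, 0.05]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

metrics = classification_metrics(y_true, y_pred)
auc_value = auc(y_true, y_score)
print(metrics)
print("AUC:", auc_value)
```

Accuracy alone can look favourable on an imbalanced cohort such as this one, which is one reason chance-corrected measures like kappa and the Matthews correlation coefficient are worth reading alongside it.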

Conclusions: The study developed an LDA model and a nomogram for predicting mortality in HF patients in the ICU. The SHAP approach was used to explain the LDA model, enhancing its utility for clinicians. Both models were made accessible online for clinical application.
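Because a two-class LDA decision function is linear in the features, its SHAP attributions have a simple closed form when features are treated as independent: each feature contributes its weight times its deviation from the background mean. The sketch below applies that formula directly rather than calling the SHAP library. The feature names, weights, intercept and patient values are hypothetical and are not taken from the study.

```python
def linear_shap(weights, intercept, background, x):
    """SHAP values for a linear score f(x) = intercept + sum(w_j * x_j).

    Assumes independent features, so phi_j = w_j * (x_j - mean_j), where mean_j
    is the average of feature j over the background sample.
    Returns (base_value, phi), with base_value = f(mean of background).
    """
    n = len(background)
    means = [sum(row[j] for row in background) / n for j in range(len(weights))]
    base_value = intercept + sum(w * m for w, m in zip(weights, means))
    phi = [w * (xj - m) for w, xj, m in zip(weights, x, means)]
    return base_value, phi


# Hypothetical linear model over three illustrative features.
feature_names = ["age", "lactate", "systolic_bp"]
weights = [0.03, 0.40, -0.02]
intercept = -3.0

# Hypothetical background sample and one patient to explain.
background = [[60.0, 1.5, 120.0], [70.0, 2.0, 110.0], [80.0, 2.5, 100.0]]
patient = [75.0, 4.0, 95.0]

base_value, phi = linear_shap(weights, intercept, background, patient)
score = intercept + sum(w * v for w, v in zip(weights, patient))

for name, contribution in zip(feature_names, phi):
    print(f"{name}: {contribution:+.2f}")
print("base value:", base_value, "score:", score)
```

The attributions add up to the difference between the patient's score and the base value, which is the additivity property that makes SHAP plots readable at the bedside.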
