An ensemble machine learning model for predicting one-year mortality in elderly coronary heart disease patients with anemia

Impact factor 8.6 · Zone 2 (Computer Science) · JCR Q1 (COMPUTER SCIENCE, THEORY & METHODS)
Longcan Cheng, Yan Nie, Hongxia Wen, Yan Li, Yali Zhao, Qian Zhang, Mingxing Lei, Shihui Fu
Journal of Big Data · Published 2024-07-24 · DOI: https://doi.org/10.1186/s40537-024-00966-x

Abstract

Objective

This study was designed to develop and validate a robust predictive model for one-year mortality in elderly coronary heart disease (CHD) patients with anemia using machine learning methods.

Methods

Demographic data, test results, comorbidities, and medications were collected for a cohort of 974 elderly patients with CHD. A prospective analysis was performed to evaluate the predictive performance of the developed models. External validation of the models was performed in a series of 112 elderly CHD patients with anemia.

Results

The overall one-year mortality was 43.6%. Risk factors included heart rate, chronic heart failure, tachycardia, and β-receptor blockers. Protective factors included hemoglobin, albumin, high-density lipoprotein cholesterol, estimated glomerular filtration rate (eGFR), left ventricular ejection fraction (LVEF), aspirin, clopidogrel, calcium channel blockers, angiotensin-converting enzyme inhibitors (ACEIs)/angiotensin receptor blockers (ARBs), and statins. The ensemble machine learning model outperformed the other algorithms, with an area under the curve (AUC) of 0.828 (95% confidence interval 0.805–0.870) and a Brier score of 0.170. Calibration and density curves further confirmed that the ensemble model produced well-calibrated predicted probabilities and discriminated well between outcomes. In external validation, the ensemble model also performed well, with an AUC of 0.825 (95% confidence interval 0.734–0.916) and a Brier score of 0.185. Patients in the high-risk group had more than six times the probability of one-year mortality of those in the low-risk group (P < 0.001). SHapley Additive exPlanations (SHAP) identified hemoglobin, albumin, eGFR, LVEF, and ACEIs/ARBs as the five factors most strongly associated with one-year mortality.
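The two headline metrics reported above, AUC and Brier score, can be illustrated with a minimal sketch. This is not the authors' code: the function names `brier_score` and `roc_auc` and the toy labels and probabilities are assumptions made purely for illustration, and the pairwise AUC shown here is the textbook rank-based definition rather than whatever implementation the study used.

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome.

    Lower is better; 0.0 is a perfect prediction.
    """
    if len(y_true) != len(y_prob) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def roc_auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """AUC as the probability that a randomly chosen positive case receives
    a higher predicted probability than a randomly chosen negative case.

    Ties between a positive and a negative count as half a win.
    """
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = died within one year, 0 = survived.
toy_outcomes = [1, 0, 1, 0, 1, 0]
toy_probabilities = [0.9, 0.2, 0.7, 0.4, 0.3, 0.6]

print(round(brier_score(toy_outcomes, toy_probabilities), 3))
print(round(roc_auc(toy_outcomes, toy_probabilities), 3))
```

For context, a model that predicts 0.5 for every patient scores a Brier score of exactly 0.25 and an AUC of 0.5, so the reported values of 0.170 and 0.828 sit clearly on the better side of both baselines.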

Conclusions

This model identifies key risk and protective factors, providing insights for improving risk assessment, informing clinical decision-making, and targeting interventions. It outperforms the other algorithms in predictive performance and offers opportunities for personalized risk mitigation strategies, with clinical implications for improving patient care.


Source journal
Journal of Big Data (Computer Science, Information Systems)
CiteScore: 17.80
Self-citation rate: 3.70%
Articles published: 105
Review time: 13 weeks
Journal description: The Journal of Big Data publishes high-quality, scholarly research papers, methodologies, and case studies covering a broad spectrum of topics, from big data analytics to data-intensive computing and all applications of big data research. It addresses challenges facing big data today and in the future, including data capture and storage, search, sharing, analytics, technologies, visualization, architectures, data mining, machine learning, cloud computing, distributed systems, and scalable storage. The journal serves as a seminal source of innovative material for academic researchers and practitioners alike.