Refining heart disease prediction accuracy using hybrid machine learning techniques with novel metaheuristic algorithms

IF 3.2 | Zone 2 (Medicine) | JCR Q2, Cardiac & Cardiovascular Systems
Haifeng Zhang, Rui Mu
International Journal of Cardiology, volume 416, Article 132506. Published 2024-08-30. DOI: 10.1016/j.ijcard.2024.132506
https://www.sciencedirect.com/science/article/pii/S0167527324011288
Citation count: 0

Abstract

Early diagnosis of heart disease is crucial, as heart disease is one of the leading causes of death globally, and machine learning algorithms can be a powerful tool for achieving it. This article therefore aims to increase the accuracy of heart disease prediction using machine learning algorithms. Five classification models are explored: eXtreme Gradient Boosting (XGBC), Random Forest Classifier (RFC), Decision Tree Classifier (DTC), K-Nearest Neighbors Classifier (KNNC), and Logistic Regression Classifier (LRC). Four optimizers are also evaluated: the Slime Mold Optimization Algorithm, the Forest Optimization Algorithm, the Pathfinder Algorithm, and Giant Armadillo Optimization (GAO). To make model selection robust, a feature selection technique based on k-fold cross-validation is employed; it identifies the most relevant features in the data, potentially improving model performance. The three best-performing models are then coupled with the optimization algorithms to potentially enhance their generalizability and accuracy in predicting heart failure. In the final stage, the shortlisted models (XGBC, RFC, and DTC) were assessed using accuracy, precision, recall, F1-score, and the Matthews Correlation Coefficient (MCC). This evaluation identified the XGGA hybrid model as the top performer in predicting heart failure: it achieved an accuracy, precision, recall, and F1-score of 0.972 in the training phase, underscoring its robustness. Notably, the model's predictions deviated from the actual outcomes by less than 5.5% for patients classified as alive and by less than 1.2% for those classified as deceased, reflecting minimal error and high predictive reliability. In contrast, the DTC base model was the least effective, with an accuracy of 0.840 and a precision of 0.847. Overall, optimization with the GAO algorithm significantly enhanced the performance of the models, highlighting the benefits of this approach.
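The five metrics named in the abstract can all be derived from a binary confusion matrix. The following is a minimal Python sketch of those formulas, not the authors' evaluation code. It assumes binary labels with 1 as the positive class and uses positive-class precision, recall, and F1; the abstract does not say which averaging the paper uses. The function names and the toy labels are hypothetical and are not taken from the paper's data.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, tn, fn


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, F1 and MCC as a dict.

    Any ratio whose denominator is zero is reported as 0.0.
    """
    tp, fp, tn, fn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mcc": mcc}


# Toy labels invented for illustration only (not the paper's data).
toy_true = [1, 1, 1, 0, 0, 0, 1, 0]
toy_pred = [1, 1, 0, 0, 0, 1, 1, 0]
toy_scores = binary_metrics(toy_true, toy_pred)
print(toy_scores["accuracy"], toy_scores["mcc"])  # prints 0.75 0.5
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the two outcome classes are imbalanced. In the toy example it is 0.5 while the other four metrics are all 0.75.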

Source journal
International Journal of Cardiology (Medicine: Cardiovascular Systems)
CiteScore: 6.80
Self-citation rate: 5.70%
Articles published: 758
Review time: 44 days
Journal description: The International Journal of Cardiology is devoted to cardiology in the broadest sense. Both basic research and clinical papers can be submitted. The journal serves the interest of both practicing clinicians and researchers. In addition to original papers, we are launching a range of new manuscript types, including Consensus and Position Papers, Systematic Reviews, Meta-analyses, and Short communications. Case reports are no longer acceptable. Controversial techniques, issues on health policy and social medicine are discussed and serve as useful tools for encouraging debate.