Authors: Haifeng Zhang, Rui Mu
Journal: International Journal of Cardiology, volume 416, Article 132506
Publication date: 2024-08-30
Publication type: Journal Article
DOI: 10.1016/j.ijcard.2024.132506
URL: https://www.sciencedirect.com/science/article/pii/S0167527324011288
Impact factor: 3.2 (JCR Q2, Cardiac & Cardiovascular Systems)
Citations: 0
Abstract
Refining heart disease prediction accuracy using hybrid machine learning techniques with novel metaheuristic algorithms
Early diagnosis of heart disease is crucial, as it is one of the leading causes of death globally, and machine learning algorithms can be a powerful tool for achieving it. This article therefore aims to increase the accuracy of heart disease prediction using machine learning. Five classification models are explored: eXtreme Gradient Boosting (XGBC), Random Forest Classifier (RFC), Decision Tree Classifier (DTC), K-Nearest Neighbors Classifier (KNNC), and Logistic Regression Classifier (LRC). Four optimizers are also evaluated: the Slime mold Optimization Algorithm, the Forest Optimization Algorithm, the Pathfinder algorithm, and Giant Armadillo Optimization (GAO). To make model selection robust, a feature selection technique based on k-fold cross-validation is employed; it identifies the most relevant features in the data, potentially improving model performance. The three best-performing models (XGBC, RFC, and DTC) are then coupled with the optimization algorithms to potentially enhance their generalizability and accuracy in predicting heart failure. In the final stage, these shortlisted models are assessed using accuracy, precision, recall, F1-score, and the Matthews Correlation Coefficient (MCC). This evaluation identifies the XGGA hybrid model as the top performer in predicting heart failure: its accuracy, precision, recall, and F1-score each reach 0.972 in the training phase, which the authors take as evidence of its robustness. The model's predictions deviate from the actual outcomes by less than 5.5% for patients classified as alive and by less than 1.2% for those classified as deceased, reflecting low error and high predictive reliability. In contrast, the DTC base model is the least effective, with an accuracy of 0.840 and a precision of 0.847.
Overall, optimization with the GAO algorithm significantly enhanced the performance of the models, highlighting the benefits of this approach.
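The five metrics named in the abstract can all be derived from a binary confusion matrix. The following is a minimal sketch of those standard definitions, not the authors' implementation: the function names `confusion_counts` and `binary_metrics`, the choice of `1` (for example, "deceased") as the positive class, and the toy labels are all assumptions made for illustration and are not data from the paper.

```python
import math


def confusion_counts(y_true, y_pred, positive=1):
    """Count (tp, fp, fn, tn) for a binary task; `positive` marks the positive class."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive and truth != positive:
            fp += 1
        elif pred != positive and truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def binary_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, F1 and MCC from confusion-matrix counts.

    Any ratio whose denominator is zero is reported as 0.0 (a common convention).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    pr_sum = precision + recall
    f1 = 2 * precision * recall / pr_sum if pr_sum else 0.0
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical labels, for illustration only (not taken from the paper).
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(binary_metrics(*confusion_counts(y_true, y_pred)))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the two outcome classes are imbalanced, which is one common reason to report it alongside the other four metrics.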
About the journal:
The International Journal of Cardiology is devoted to cardiology in the broadest sense. Both basic research and clinical papers can be submitted. The journal serves the interest of both practicing clinicians and researchers.
In addition to original papers, we are launching a range of new manuscript types, including Consensus and Position Papers, Systematic Reviews, Meta-analyses, and Short communications. Case reports are no longer acceptable. Controversial techniques, issues on health policy and social medicine are discussed and serve as useful tools for encouraging debate.