A Soft Voting Ensemble Classifier to Improve Survival Rate Predictions of Cardiovascular Heart Failure Patients

Arif Munandar, Wiga Maulana Baihaqi, Ade Nurhopipah
{"title":"A Soft Voting Ensemble Classifier to Improve Survival Rate Predictions of Cardiovascular Heart Failure Patients","authors":"Arif Munandar, Wiga Maulana Baihaqi, Ade Nurhopipah","doi":"10.33096/ilkom.v15i2.1632.344-352","DOIUrl":null,"url":null,"abstract":"Cardiovascular disease is one of the deadliest diseases, claiming around 17 million lives worldwide each year. According to data from the World Health Organization (WHO), more than four out of five deaths from cardiovascular disease are caused by heart attacks and strokes, and one-third of these deaths occur prematurely in people under the age of 70. Machine learning approaches can be used to detect the disease. This research aims to improve the prediction model of cardiovascular heart failure patient survival using C4.5, KNN, Logistic Regression algorithms, and the ensemble learning method of Voting Classifier. Based on the testing results, each model showed a significant increase in accuracy in the 70:30 ratio. Logistic Regression and C4.5 achieved the same accuracy, 89.47%, KNN obtained 91.23%, and Voting Classifier experienced a considerable improvement, reaching 94.74%. In testing with ratios of 90:10, 80:20, and 70:30, KNN demonstrated high accuracy but had significant overfitting, with a difference of 7-9% between training and testing accuracy scores in the 90:10 and 80:20 ratios. On the other hand, Voting Classifier showed stable performance in the 70:30 ratio, with an accuracy difference between training and testing scores below 1%. 
The conclusion of this research is that the Voting Classifier can assist the performance improvement of algorithms for classifying the survival expectancy of cardiovascular heart failure patients into 'Survived' or 'Deceased', compared to Logistic Regression, KNN, and C4.5.","PeriodicalId":33690,"journal":{"name":"Ilkom Jurnal Ilmiah","volume":"75 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Ilkom Jurnal Ilmiah","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.33096/ilkom.v15i2.1632.344-352","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Cardiovascular disease is among the deadliest diseases, claiming around 17 million lives worldwide each year. According to the World Health Organization (WHO), more than four out of five deaths from cardiovascular disease are caused by heart attacks and strokes, and one-third of these deaths occur prematurely in people under the age of 70. Machine learning approaches can be used to detect the disease. This research aims to improve survival prediction for cardiovascular heart failure patients using the C4.5, KNN, and Logistic Regression algorithms together with the Voting Classifier ensemble learning method. In testing, every model showed a significant increase in accuracy at the 70:30 split ratio: Logistic Regression and C4.5 both reached 89.47%, KNN reached 91.23%, and the Voting Classifier improved considerably, reaching 94.74%. Across the 90:10, 80:20, and 70:30 ratios, KNN achieved high accuracy but overfit significantly, with a gap of 7-9% between training and testing accuracy at the 90:10 and 80:20 ratios. The Voting Classifier, by contrast, was stable at the 70:30 ratio, with a gap between training and testing accuracy below 1%. The study concludes that, compared with Logistic Regression, KNN, and C4.5 used individually, the Voting Classifier improves performance in classifying cardiovascular heart failure patients as 'Survived' or 'Deceased'.
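
The setup described in the abstract can be illustrated with a minimal scikit-learn sketch. This is not the authors' code. Several things here are assumptions: the data is synthetic (generated with make_classification, since the patient records are not reproduced on this page), the hyperparameters are arbitrary, the helper names build_soft_voter and train_test_gap are hypothetical, and DecisionTreeClassifier with the entropy criterion stands in for C4.5 (scikit-learn implements CART, not C4.5). The sketch shows soft voting, which averages the class probabilities predicted by the base models, and the training-versus-testing accuracy comparison used in the abstract to judge overfitting at each split ratio.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): not the heart failure records from the paper.
X, y = make_classification(
    n_samples=300, n_features=12, n_informative=6, random_state=0
)


def build_soft_voter():
    """Build a soft voting ensemble over the three base model families."""
    return VotingClassifier(
        estimators=[
            ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
            ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
            # CART with the entropy criterion; only an approximation of C4.5.
            ("tree", DecisionTreeClassifier(criterion="entropy", max_depth=4, random_state=0)),
        ],
        voting="soft",  # average predicted probabilities instead of counting votes
    )


def train_test_gap(test_size):
    """Fit on one split and return (training accuracy, testing accuracy)."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=test_size, stratify=y, random_state=0
    )
    model = build_soft_voter().fit(X_tr, y_tr)
    return model.score(X_tr, y_tr), model.score(X_te, y_te)


# Test fractions 0.10, 0.20, 0.30 correspond to the 90:10, 80:20, 70:30 ratios.
results = {}
for test_size in (0.10, 0.20, 0.30):
    train_acc, test_acc = train_test_gap(test_size)
    results[test_size] = (train_acc, test_acc)
    print(f"test fraction {test_size:.2f}: train={train_acc:.4f} "
          f"test={test_acc:.4f} gap={train_acc - test_acc:+.4f}")
```

A large positive gap between training and testing accuracy is the overfitting signal the abstract reports for KNN; the numbers printed here come from synthetic data and say nothing about the paper's results.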