Heart Disease Risk Prediction using Machine Learning with Principal Component Analysis

K. Reddy, I. Elamvazuthi, A. Aziz, S. Paramasivam, Hui Na Chua
Published in: 2020 8th International Conference on Intelligent and Advanced Systems (ICIAS)
Publication date: 2021-07-13
DOI: 10.1109/ICIAS49414.2021.9642676 (https://doi.org/10.1109/ICIAS49414.2021.9642676)
Citations: 6

Abstract

Cardiovascular diseases (CVDs) kill about 17.9 million people every year. Early prediction can help people change their lifestyles and, if necessary, receive proper medical treatment. Data available in the healthcare sector is very useful for predicting whether a patient will develop a disease in the future. In this research, several machine learning algorithms, namely Decision Tree (DT), Discriminant Analysis (DA), Logistic Regression (LR), Naïve Bayes (NB), Support Vector Machines (SVM), K-Nearest Neighbors (KNN), and Ensemble classifiers, were trained on the Cleveland heart disease dataset. The performance of the algorithms was evaluated using 10-fold cross-validation with and without Principal Component Analysis (PCA). LR provided the highest accuracy, 85.8%, with PCA retaining 9 components, while the Ensemble classifiers attained an accuracy of 83.8% using a bagged tree with PCA retaining 10 components.
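The evaluation protocol described in the abstract (10-fold cross-validation of a classifier with and without PCA) can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the use of scikit-learn, the standardization step, the stratified shuffled folds, the helper name `cv_accuracy`, and the synthetic stand-in data (303 rows and 13 features, sizes chosen only for illustration) are all assumptions, so the printed accuracies will not match the figures reported in the paper.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def cv_accuracy(X, y, n_components=None, folds=10, seed=0):
    """Mean k-fold accuracy of logistic regression, optionally after PCA.

    Hypothetical helper for illustration; not taken from the paper.
    """
    # Scaling and PCA live inside the pipeline so they are re-fit on the
    # training portion of every fold, which avoids leaking test data.
    steps = [StandardScaler()]
    if n_components is not None:
        steps.append(PCA(n_components=n_components))
    steps.append(LogisticRegression(max_iter=1000))
    model = make_pipeline(*steps)

    cv = StratifiedKFold(n_splits=folds, shuffle=True, random_state=seed)
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    return scores.mean()


# Synthetic stand-in data (assumption): replace with the real
# Cleveland heart disease features and labels to reproduce the setup.
X, y = make_classification(
    n_samples=303, n_features=13, n_informative=8, random_state=0
)

acc_no_pca = cv_accuracy(X, y)                 # baseline without PCA
acc_pca_9 = cv_accuracy(X, y, n_components=9)  # keep 9 principal components

print(f"LR without PCA:          {acc_no_pca:.3f}")
print(f"LR with PCA (9 comps.):  {acc_pca_9:.3f}")
```

Sweeping `n_components` over a range and comparing the resulting accuracies would be one way to arrive at a choice such as the 9 or 10 components mentioned above; swapping `LogisticRegression` for another scikit-learn estimator gives the corresponding comparison for the other classifier families.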