PERFORMANCE EVALUATION OF MACHINE LEARNING ALGORITHMS IN THE DIAGNOSIS AND CLASSIFICATION OF HEART DISEASES

Olanloye Odunayo, Olawumi Olasunkanmi, A. Adeyemo, Adebayo Segun
Journal of Applied Science, Information and Computing. Published 2022-06-20. DOI: 10.59568/jasic-2022-3-1-03 (https://doi.org/10.59568/jasic-2022-3-1-03)

Abstract

Clinical reports and research have established that heart disease, a typical cardiovascular disease, has caused millions of premature deaths worldwide. The World Health Organization (WHO) has confirmed this, and researchers in various fields have made a series of attempts to address the problem. Some researchers in computing and health informatics have applied predictive analytics to detect and classify the disease based on biomarkers identified in affected individuals. However, too little has been done to determine individuals' level of susceptibility to heart disease by first examining key indicators such as age, sex, sugar level and other related attributes before predictive analytics are applied. This study explores these attributes and establishes that sex, age, cholesterol level and others are strong markers of a patient's susceptibility to heart disease. In addition, four ML models (KNN, NB, SVM and RF) were implemented, and their performance in classifying heart disease was evaluated using cross-validation and a test dataset, first with every feature available in the dataset and then with only the correlated features identified in the descriptive analytics. Accuracy improved across all models when only the correlated features were used, and SVM achieved the highest accuracy and F1 score (84%). SVM therefore outperformed KNN, RF and NB when all the models were evaluated on the 25% test set with the correlated features. It can therefore be concluded that an in-depth understanding of features for identifying strong disease biomarkers will lead to more accurate diagnostics, which in turn will help medical practitioners and other stakeholders track the susceptibility of individuals with the identified features to heart disease.
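The two ideas the abstract leans on, keeping only features correlated with the diagnosis and scoring a classifier by accuracy and F1, can be sketched in a few lines. This is a minimal illustration and not the authors' code: the abstract does not state which correlation measure or cutoff was used, so Pearson correlation and the `threshold=0.3` default are assumptions, and the function names (`pearson`, `select_correlated`, `accuracy_and_f1`) and the toy columns are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def select_correlated(columns, target, threshold=0.3):
    """Keep feature names whose |r| with the target meets the threshold.

    The threshold is an assumed cutoff, not a value from the paper.
    """
    return [
        name
        for name, values in columns.items()
        if abs(pearson(values, target)) >= threshold
    ]


def accuracy_and_f1(y_true, y_pred, positive=1):
    """Accuracy and F1 score for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return accuracy, f1


if __name__ == "__main__":
    # Hypothetical toy data: 1 = heart disease present, 0 = absent.
    target = [0, 0, 0, 0, 1, 1, 1, 1]
    columns = {
        "age": [30, 35, 40, 45, 50, 55, 60, 65],
        "noise": [1, 2, 1, 2, 1, 2, 1, 2],
    }
    print(select_correlated(columns, target))
    print(accuracy_and_f1([1, 1, 1, 0, 0, 0, 1, 0], [1, 1, 0, 0, 0, 1, 1, 0]))
```

In a fuller pipeline, the selected columns would be passed to each of the four classifiers, and the metrics would be computed both under cross-validation and on the held-out 25% test split that the abstract describes.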