Detect the Cardiovascular Disease's in Initial Phase using a Range of Feature Selection Techniques of ML

International Research Journal of Multidisciplinary Technovation Pub Date : 2024-05-14 DOI:10.54392/irjmt24313

Prashant Maganlal Goad, Pramod J. Deore


Citations: 0

Abstract

Heart-related conditions remain the leading cause of death worldwide. In 2000, heart disease claimed around 14 million lives worldwide, a number that surged to approximately 620 million by 2023. An aging and growing population contributes significantly to this rising mortality trend. This also underscores the potential impact of early intervention, which is crucial for reducing deaths from heart failure and in which prevention plays a pivotal role. The aim of the present research is to develop a prospective ML framework that can identify important features and predict cardiac conditions at an early stage using a variety of feature selection strategies. The selected feature subsets were designated FST1, FST2, and FST3. Three feature selection methods were used: correlation-based feature selection, chi-square, and mutual information. Next, the best-performing model and the most suitable feature subset were identified using six machine learning models: Logistic Regression (LR) (AL1), Support Vector Machine (SVM) (AL2), K-Nearest Neighbor (K-NN) (AL3), Random Forest (RF) (AL4), Naive Bayes (NB) (AL5), and Decision Tree (DT) (AL6). Ultimately, we found that the random forest model gave the best results on the F3 feature set, with 95.25% accuracy, 95.11% sensitivity, 95.23% specificity, an area under the receiver operating characteristic curve (AUROC) of 96.96, and a log loss of 0.27. Coronary artery disease prediction has not previously been investigated in depth; our study evaluates multiple metrics (specificity, sensitivity, accuracy, AUROC, and log loss) and uses multiple feature selection methods to improve algorithm performance on the important features. The proposed model shows considerable promise for clinical use in predicting CVD at a precursor stage at minimal cost and in less time, and it can help physicians with limited experience make the right decision based on the model's results combined with specific criteria.
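The workflow summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses synthetic data from `make_classification` in place of the (unspecified) heart-disease dataset, and the subset size `K`, the helper names `correlation_scores` and `evaluate`, and all hyperparameters are hypothetical. Only the random forest is shown; the other five classifiers could be swapped in the same way.

```python
from functools import partial

import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, chi2, mutual_info_classif
from sklearn.metrics import accuracy_score, confusion_matrix, log_loss, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in data (assumption): 13 features, binary target.
X, y = make_classification(n_samples=600, n_features=13, n_informative=6,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# chi2 needs non-negative inputs, so scale to [0, 1] using the training split only.
scaler = MinMaxScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)

K = 8  # hypothetical number of features to keep


def correlation_scores(X, y):
    """Absolute Pearson correlation of each feature with the target."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return np.abs(Xc.T @ yc) / denom


# One selector per method named in the abstract.
selectors = {
    "correlation": SelectKBest(correlation_scores, k=K),
    "chi_square": SelectKBest(chi2, k=K),
    "mutual_info": SelectKBest(partial(mutual_info_classif, random_state=0), k=K),
}


def evaluate(selector):
    """Fit the selector and a random forest, then compute the five metrics."""
    Xtr = selector.fit_transform(X_train_s, y_train)
    Xte = selector.transform(X_test_s)
    model = RandomForestClassifier(n_estimators=200, random_state=0).fit(Xtr, y_train)
    proba = model.predict_proba(Xte)[:, 1]
    pred = model.predict(Xte)
    tn, fp, fn, tp = confusion_matrix(y_test, pred).ravel()
    return {
        "accuracy": accuracy_score(y_test, pred),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "auroc": roc_auc_score(y_test, proba),
        "log_loss": log_loss(y_test, proba),
    }


results = {name: evaluate(sel) for name, sel in selectors.items()}
for name, metrics in results.items():
    print(name, {k: round(float(v), 3) for k, v in metrics.items()})
```

Fitting the scaler and the selectors on the training split only keeps information from the test split out of the feature ranking; whether the original study followed this protocol is not stated in the abstract.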




Source journal

International Research Journal of Multidisciplinary Technovation

CiteScore

0.50

Self-citation rate

0.00%

Articles published