Detect the Cardiovascular Disease's in Initial Phase using a Range of Feature Selection Techniques of ML
Prashant Maganlal Goad, Pramod J. Deore
International Research Journal of Multidisciplinary Technovation, published 2024-05-14
DOI: 10.54392/irjmt24313 (https://doi.org/10.54392/irjmt24313)
Citation count: 0
Abstract
Heart-related conditions remain the leading cause of death worldwide. In 2000, heart disease claimed around 14 million lives worldwide, a number that surged to approximately 620 million by 2023. An aging and growing population contributes significantly to this rising mortality trend. This also underscores how much early intervention could achieve: prevention plays a pivotal role in reducing deaths from heart failure. The aim of the present research is to develop a prospective ML framework that can identify important features and predict cardiac conditions at an early stage using a variety of feature selection strategies. The selected feature subsets were designated FST1, FST2, and FST3. Three methods were used to select features: correlation-based feature selection, chi-square, and mutual information. Next, six machine learning models were used to identify the best-performing model and the most appropriate feature selection method: logistic regression (LR) (AL1), support vector machine (SVM) (AL2), k-nearest neighbors (K-NN) (AL3), random forest (RF) (AL4), naive Bayes (NB) (AL5), and decision tree (DT) (AL6). Ultimately, the random forest model gave the best results on the FST3 feature set, with 95.25% accuracy, 95.11% sensitivity, 95.23% specificity, an area under the receiver operating characteristic curve (AUROC) of 96.96, and a log loss of 0.27. Coronary artery disease forecasting has not previously been investigated in depth; our study evaluates multiple statistics (specificity, sensitivity, accuracy, AUROC, and log loss) and uses multiple feature selection methods to improve algorithm performance on the important features.
The proposed model shows considerable promise for clinical use in predicting CVD at a precursor stage, at minimal cost and in less time, and can help physicians with limited experience make the right decision based on the model's results combined with specific criteria.
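As a rough illustration of two ideas named in the abstract, the sketch below ranks discrete features by mutual information with the class label and computes the five reported statistics (accuracy, sensitivity, specificity, AUROC, log loss) for a binary classifier. It is not the authors' implementation: the function names `mutual_information` and `evaluate`, the 0.5 decision threshold, and the toy data are all assumptions made for illustration, and only the Python standard library is used.

```python
import math
from collections import Counter


def mutual_information(feature, labels):
    """Mutual information (in nats) between a discrete feature and the labels."""
    n = len(labels)
    joint = Counter(zip(feature, labels))
    count_x = Counter(feature)
    count_y = Counter(labels)
    # Sum p(x, y) * log(p(x, y) / (p(x) * p(y))) over the observed value pairs.
    return sum(
        (c / n) * math.log(c * n / (count_x[x] * count_y[y]))
        for (x, y), c in joint.items()
    )


def evaluate(y_true, y_prob, threshold=0.5):
    """Accuracy, sensitivity, specificity, AUROC and log loss for 0/1 labels."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

    # AUROC as the fraction of (positive, negative) pairs ranked correctly;
    # tied scores count as half a correct pair.
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in pos for b in neg)

    # Clip probabilities so the logarithm is always defined.
    eps = 1e-15
    clipped = [min(max(p, eps), 1 - eps) for p in y_prob]
    log_loss = -sum(
        t * math.log(p) + (1 - t) * math.log(1 - p)
        for t, p in zip(y_true, clipped)
    ) / len(y_true)

    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "auroc": wins / (len(pos) * len(neg)),
        "log_loss": log_loss,
    }


# Toy data (hypothetical, not from the paper): 1 = disease, 0 = no disease.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
features = {
    "informative": [1, 1, 1, 0, 0, 0, 1, 0],
    "uninformative": [0, 1, 0, 1, 0, 1, 1, 0],
}

# Rank features from most to least informative about the label.
ranked = sorted(
    features,
    key=lambda name: mutual_information(features[name], labels),
    reverse=True,
)

# Score a set of made-up predicted probabilities against the labels.
scores = evaluate(labels, [0.9, 0.8, 0.4, 0.2, 0.1, 0.6, 0.7, 0.3])
print(ranked)
print(scores)
```

In practice a library such as scikit-learn provides tested versions of these scoring and metric functions; the hand-written versions here only make the definitions explicit.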