{"title":"Assessing Feature Selection Techniques for Machine Learning Models using Cardiac Dataset","authors":"Shital Patil, Surendra Bhosale","doi":"10.1109/AIKE55402.2022.00027","DOIUrl":null,"url":null,"abstract":"Cardiac disorders are the leading causes of morbidity and mortality in the world, accounting for a large number of deaths over the last few decades, and have emerged as the most life-threatening disease globally. Machine learning and Artificial intelligence have been playing key role in predicting the heart diseases. A relevant set of feature can be very helpful in predicting the disease accurately. In this study, we proposed a comparative analysis of 4 different features selection methods and evaluated their performance with both raw (Unbalanced dataset) and sampled (Balanced) dataset. The publicly available Z-Alizadeh Sani dataset have been used for this study. Four different feature selection techniques: Data Analysis, minimum Redundancy maximum Relevance (mRMR), and Recursive Feature Elimination (RFE) are used in this study. These methods are tested with 8 different classification models to get the best accuracy possible. Using balanced and unbalanced dataset, the study shows promising results in terms of various performance metrics in accurately predicting heart disease. Experimental results obtained by the proposed method with the raw data obtains maximum AUC of 100%, maximum F1 score of 94%, maximum SENS of 98%, maximum precision (PREC) of 93%. 
While with the balanced dataset obtained results are, maximum AUC of 100%, F1-score 95%, maximum SENS of 95%, maximum PREC of 97%.","PeriodicalId":441077,"journal":{"name":"2022 IEEE Fifth International Conference on Artificial Intelligence and Knowledge Engineering (AIKE)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE Fifth International Conference on Artificial Intelligence and Knowledge Engineering (AIKE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/AIKE55402.2022.00027","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
Cardiac disorders are the leading causes of morbidity and mortality worldwide; they have accounted for a large number of deaths over the last few decades and have emerged as the most life-threatening class of disease globally. Machine learning and artificial intelligence play a key role in predicting heart disease, and a relevant set of features can greatly help in predicting the disease accurately. In this study, we present a comparative analysis of four feature selection methods and evaluate their performance on both the raw (unbalanced) and the sampled (balanced) dataset. The publicly available Z-Alizadeh Sani dataset is used for this study. The feature selection techniques include Data Analysis, minimum Redundancy maximum Relevance (mRMR), and Recursive Feature Elimination (RFE). These methods are tested with eight different classification models to obtain the best possible accuracy. On both the balanced and the unbalanced dataset, the study shows promising results across several performance metrics for accurately predicting heart disease. With the raw data, the proposed method achieves a maximum AUC of 100%, a maximum F1 score of 94%, a maximum sensitivity (SENS) of 98%, and a maximum precision (PREC) of 93%. With the balanced dataset, it achieves a maximum AUC of 100%, an F1 score of 95%, a maximum SENS of 95%, and a maximum PREC of 97%.
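The abstract reports AUC, F1 score, sensitivity, and precision without showing how they are computed. The following is a minimal sketch of the standard definitions of these four metrics for a binary classifier, not the authors' implementation. All function names are hypothetical, and the labels and scores are made-up toy values rather than data from the Z-Alizadeh Sani dataset.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels, where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def sensitivity(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Sensitivity (recall): TP / (TP + FN). Returns 0.0 if there are no positives."""
    tp, _, fn, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Precision: TP / (TP + FP). Returns 0.0 if nothing was predicted positive."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """F1: harmonic mean of precision and sensitivity."""
    p = precision(y_true, y_pred)
    r = sensitivity(y_true, y_pred)
    return 2 * p * r / (p + r) if (p + r) else 0.0


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half. This O(n^2) form is for illustration only.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 5 positives, 3 negatives, threshold 0.5.
toy_labels = [1, 1, 1, 0, 0, 1, 0, 1]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.7, 0.2, 0.55]
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]

print("SENS:", round(sensitivity(toy_labels, toy_preds), 3))
print("PREC:", round(precision(toy_labels, toy_preds), 3))
print("F1:  ", round(f1_score(toy_labels, toy_preds), 3))
print("AUC: ", round(auc(toy_labels, toy_scores), 3))
```

Sensitivity, precision, and F1 depend on the chosen decision threshold, whereas AUC is computed from the raw scores. This is one reason a model can reach an AUC of 100% while its thresholded metrics stay lower.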