Intelligent Feature Selection on Multivariate Dataset using Advanced Data Profiling
Authors: Ritu Chaturvedi, Vandana V. Patnaik
Published in: 2022 IEEE International IOT, Electronics and Mechatronics Conference (IEMTRONICS)
Publication date: 2022-06-01
DOI: https://doi.org/10.1109/iemtronics55184.2022.9795745
Citations: 0
Abstract
The differential diagnosis of diseases that share similar clinical features is a real and difficult problem in medicine. This paper demonstrates the use of data mining (DM) techniques to augment standard data profiling methods and establishes an efficient approach to intelligent feature selection for diseases that share similar features. Experimental results show that using DM techniques to select features, as an additional layer on top of data profiling, considerably improves the performance of a prediction model built to predict a disease such as "Psoriasis". The paper also records a brief comparison, with respect to predictive accuracy, between the features selected by existing mining tools such as Weka and those selected by the proposed approach. The proposed algorithm is a promising tool for assisting the diagnosis of conditions such as erythemato-squamous diseases, where symptoms overlap. By combining data cleansing and knowledge discovery techniques, the algorithm aims to be "agnostic" and can be used on a wide variety of data standards with variable data quality.