{"title":"Selecting Dominant Features for the Prediction of Early-Stage Chronic Kidney Disease","authors":"Vinothini Arumugam, S. Baghavathi Priya","doi":"10.32604/iasc.2022.018654","DOIUrl":null,"url":null,"abstract":"Nowadays, Chronic Kidney Disease (CKD) is one of the vigorous public health diseases. Hence, early detection of the disease may reduce the severity of its consequences. Besides, medical databases of any disease diagnosis may be collected from the blood test, urine test, and patient history. Nevertheless, medical information retrieved from various sources is diverse. Therefore, it is unadaptable to evaluate numerical and nominal features using the same feature selection algorithm, which may lead to fallacious analysis. Applying machine learning techniques over the medical database is a common way to help feature identification for CKD prediction. In this paper, a novel Mixed Data Feature Selection (MDFS) model is proposed to select and filter preeminent features from the medical dataset for earlier CKD prediction, where CKD clinical data with 12 numerical and 12 nominal features are fed to the MDFS model. For each feature in the mixed dataset, the model applies feature selection methods according to the data type of the feature. Point Biserial correlation and a Chi-square filter are applied to filter the numerical features and nominal features, respectively. Meanwhile, an SVM algorithm is employed to evaluate and select the best feature subset. In our experimental results, the proposed MDFS model performs superior to existing works in terms of accuracy and the number of reduced features. The identified feature subset is also demonstrated to preserve its original properties without discretization during feature selection.","PeriodicalId":50357,"journal":{"name":"Intelligent Automation and Soft Computing","volume":"5 1","pages":""},"PeriodicalIF":2.0000,"publicationDate":"2022-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Intelligent Automation and Soft Computing","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.32604/iasc.2022.018654","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"Computer Science","Score":null,"Total":0}
引用次数: 1
Abstract
Nowadays, Chronic Kidney Disease (CKD) is one of the vigorous public health diseases. Hence, early detection of the disease may reduce the severity of its consequences. Besides, medical databases of any disease diagnosis may be collected from the blood test, urine test, and patient history. Nevertheless, medical information retrieved from various sources is diverse. Therefore, it is unadaptable to evaluate numerical and nominal features using the same feature selection algorithm, which may lead to fallacious analysis. Applying machine learning techniques over the medical database is a common way to help feature identification for CKD prediction. In this paper, a novel Mixed Data Feature Selection (MDFS) model is proposed to select and filter preeminent features from the medical dataset for earlier CKD prediction, where CKD clinical data with 12 numerical and 12 nominal features are fed to the MDFS model. For each feature in the mixed dataset, the model applies feature selection methods according to the data type of the feature. Point Biserial correlation and a Chi-square filter are applied to filter the numerical features and nominal features, respectively. Meanwhile, an SVM algorithm is employed to evaluate and select the best feature subset. In our experimental results, the proposed MDFS model performs superior to existing works in terms of accuracy and the number of reduced features. The identified feature subset is also demonstrated to preserve its original properties without discretization during feature selection.
期刊介绍:
An International Journal seeks to provide a common forum for the dissemination of accurate results about the world of intelligent automation, artificial intelligence, computer science, control, intelligent data science, modeling and systems engineering. It is intended that the articles published in the journal will encompass both the short and the long term effects of soft computing and other related fields such as robotics, control, computer, vision, speech recognition, pattern recognition, data mining, big data, data analytics, machine intelligence, cyber security and deep learning. It further hopes it will address the existing and emerging relationships between automation, systems engineering, system of systems engineering and soft computing. The journal will publish original and survey papers on artificial intelligence, intelligent automation and computer engineering with an emphasis on current and potential applications of soft computing. It will have a broad interest in all engineering disciplines, computer science, and related technological fields such as medicine, biology operations research, technology management, agriculture and information technology.