RECENT DEVELOPMENTS IN SOFT COMPUTING BASED TECHNIQUES FOR FEATURE SELECTION AND DISEASE CLASSIFICATION
Authors: Naiyar Iqbal, P. Kumar
Journal: Suranaree Journal of Science and Technology (JCR Q4, Multidisciplinary Sciences; impact factor 0.2)
Published: 2023-05-26
DOI: 10.55766/sujst-2023-02-e01872 (https://doi.org/10.55766/sujst-2023-02-e01872)
Citations: 0
Abstract
Computational disease prediction is a vital part of medical research: it contributes to computer-aided diagnostics and supports doctors and other medical practitioners in critical decision-making for a range of bacterial and viral diseases, including COVID-19. Feature selection techniques serve as a preprocessing phase for classification and prediction algorithms. For disease prediction, the features may be patients’ clinical profiles or genomic features such as gene expression profiles from microarrays and read counts from RNA-Seq. The performance of a classifier depends primarily on the selected features. In addition, genomic features are so numerous that they give rise to the curse of dimensionality. In the last few years, several feature selection algorithms have been developed to overcome these problems and to support efforts against diseases such as various cancers, Zika virus, Ebola virus, and COVID-19. In this review article, we systematically examine soft computing-based approaches for feature selection and disease prediction across three data types: patients’ clinical profiles, microarray gene expression profiles, and RNA-Seq sample profiles. In the related work discussed, the shares of medical data types used, which the article also presents pictorially, were 52% for clinical symptoms, 27% for gene expression, 9% for MRI images, and 12% for other data types such as signal or text-based data. We also highlight the significant challenges and future directions in this research domain.
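To make the role of feature selection as a preprocessing phase concrete, the following is a minimal sketch, not a method taken from the review. It assumes synthetic "expression-like" data with many features and few samples, and it uses a simple filter score (class-mean difference divided by the sum of the per-class standard deviations) rather than a soft computing technique. All function names, parameters, and data are hypothetical and chosen for illustration only.

```python
import random
import statistics


def make_synthetic_data(n_samples=40, n_features=200, n_informative=5, seed=0):
    """Return (X, y). Only the first n_informative columns carry class signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate classes so both are equally represented
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in range(n_informative):
            row[j] += 5.0 * label  # shift informative features for class 1
        X.append(row)
        y.append(label)
    return X, y


def score_features(X, y):
    """Score each feature: |difference of class means| / sum of class std devs."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, lab in zip(X, y) if lab == 0]
        b = [row[j] for row, lab in zip(X, y) if lab == 1]
        spread = statistics.pstdev(a) + statistics.pstdev(b) + 1e-12
        scores.append(abs(statistics.mean(a) - statistics.mean(b)) / spread)
    return scores


def select_top_k(scores, k):
    """Return the indices of the k highest-scoring features, sorted by index."""
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:k])


def nearest_centroid_predict(X_train, y_train, X_test, features):
    """Assign each test row to the class with the closest centroid,
    measuring squared Euclidean distance over the selected features only."""
    centroids = {}
    for lab in set(y_train):
        rows = [row for row, row_lab in zip(X_train, y_train) if row_lab == lab]
        centroids[lab] = [statistics.mean([r[j] for r in rows]) for j in features]
    preds = []
    for row in X_test:
        dists = {
            lab: sum((row[j] - c) ** 2 for j, c in zip(features, cen))
            for lab, cen in centroids.items()
        }
        preds.append(min(dists, key=dists.get))
    return preds


X, y = make_synthetic_data()
X_train, y_train = X[:30], y[:30]
X_test, y_test = X[30:], y[30:]

# Fit the selection step on training data only, then classify held-out samples.
scores = score_features(X_train, y_train)
selected = select_top_k(scores, k=5)
preds = nearest_centroid_predict(X_train, y_train, X_test, selected)
accuracy = sum(p == t for p, t in zip(preds, y_test)) / len(y_test)
print("selected features:", selected)
print("held-out accuracy:", accuracy)
```

The selection step is fitted on the training split only, so the held-out samples do not influence which features are kept. A soft computing approach of the kind the review covers would replace `score_features` and `select_top_k` with a search over feature subsets, while the surrounding pipeline would keep the same shape.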