A Hybrid Unsupervised Feature Selection Algorithm
Rana Pratap Singh, Kuldeep Singh Jadon
2021 10th IEEE International Conference on Communication Systems and Network Technologies (CSNT)
Published: 2021-06-18
DOI: 10.1109/CSNT51715.2021.9509674 (https://doi.org/10.1109/CSNT51715.2021.9509674)
Citations: 0
Abstract
The rapid growth of data has produced vast amounts of high-dimensional data such as images, text, and medical microarray data. Besides sharply increasing the storage and processing burden on algorithms and computer hardware, processing high-dimensional data directly often yields poor performance because of irrelevant, noisy, and redundant dimensions. The large number of features in datasets used for machine intelligence poses a major challenge for researchers: algorithms that operate on such high-dimensional features suffer in both the computation time needed to make decisions and the memory needed to store the features. In this work we develop a hybrid algorithm that selects the highly discriminative features in a dataset. The algorithm uses the multicluster feature rank score and unsupervised discriminative feature ranking methods to select the most discriminative features, and we carry out comprehensive experiments on well-documented datasets such as ORL. Our experimental results show that the proposed algorithm outperforms some state-of-the-art algorithms.
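The abstract does not say how the two feature rankings are combined, so the following is only a minimal sketch of one common way to hybridize two unsupervised rankings: averaging each feature's rank across both scorers. The combination rule, the function names `ranks_from_scores` and `hybrid_select`, and the toy data are all assumptions for illustration. The two scorers used here (variance and mean absolute deviation) are simple stand-ins, not the multicluster feature rank score or the unsupervised discriminative feature ranking method named above.

```python
import numpy as np


def ranks_from_scores(scores):
    """Convert per-feature scores to ranks, where rank 0 is the highest score."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(-scores, kind="stable")  # feature indices, best first
    ranks = np.empty(len(order), dtype=int)
    ranks[order] = np.arange(len(order))
    return ranks


def hybrid_select(scores_a, scores_b, n_select):
    """Pick the n_select features with the best mean rank across two rankings.

    Mean-rank aggregation is an assumed combination rule, not the paper's.
    """
    mean_rank = (ranks_from_scores(scores_a) + ranks_from_scores(scores_b)) / 2.0
    return np.argsort(mean_rank, kind="stable")[:n_select]


# Toy data: 100 samples, 6 features with deliberately different spreads.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 6)) * np.array([0.1, 1.0, 3.0, 0.5, 2.0, 0.2])

# Stand-in scorers (assumptions): any two unsupervised feature scores
# with "higher is better" semantics could be substituted here.
scores_a = X.var(axis=0)
scores_b = np.abs(X - np.median(X, axis=0)).mean(axis=0)

selected = hybrid_select(scores_a, scores_b, n_select=2)
```

Rank averaging is used here because the two scorers may produce values on different scales; working with ranks avoids having to normalize the raw scores before combining them.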