An Efficient Feature Subset Selection for Improved Stability Using T-Statistic
R. Karthika
2017 Second International Conference on Recent Trends and Challenges in Computational Models (ICRTCCM), 2017-02-01
DOI: 10.1109/ICRTCCM.2017.39 (https://doi.org/10.1109/ICRTCCM.2017.39)
Citation count: 1
Abstract
Large amounts of high-dimensional data accumulate in databases in day-to-day life. Data mining is used to extract useful information from such data. To classify or cluster high-dimensional data, its dimensionality needs to be reduced. Feature selection keeps the features that are relevant to the analysis and discards those that are irrelevant or redundant. Many feature subset selection algorithms are available. In this paper, we evaluate the stability of the selected feature subset using a measure called T-Statistic and improve the prediction accuracy of the classifier using Booster.
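
As a rough illustration of what evaluating selection stability can look like, the sketch below ranks features by the absolute Welch t-statistic between two classes, repeats the selection on bootstrap resamples, and reports the mean pairwise Jaccard overlap of the selected subsets. It is not the paper's procedure: the abstract does not define its T-Statistic measure or the Booster step, so the function names (`t_scores`, `top_k`, `selection_stability`), the Jaccard score, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def t_scores(X, y):
    """Absolute Welch t-statistic of each feature between class 0 and class 1."""
    a, b = X[y == 0], X[y == 1]
    # Per-feature variance of the difference in class means (unequal variances).
    var_term = a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b)
    # Small constant guards against division by zero for constant features.
    return np.abs(a.mean(axis=0) - b.mean(axis=0)) / np.sqrt(var_term + 1e-12)


def top_k(X, y, k):
    """Indices of the k features with the largest absolute t-statistic."""
    return frozenset(np.argsort(t_scores(X, y))[-k:].tolist())


def selection_stability(X, y, k, n_resamples=20, seed=0):
    """Mean pairwise Jaccard overlap of top-k subsets over bootstrap resamples.

    Hypothetical stability score chosen for this sketch: 1.0 means every
    resample selected the same subset, values near 0 mean little overlap.
    """
    rng = np.random.default_rng(seed)
    n = len(y)
    subsets = []
    for _ in range(n_resamples):
        idx = rng.integers(0, n, size=n)  # sample rows with replacement
        # Skip degenerate resamples where a class has fewer than two rows.
        if min((y[idx] == 0).sum(), (y[idx] == 1).sum()) < 2:
            continue
        subsets.append(top_k(X[idx], y[idx], k))
    pairs = [(s, t) for i, s in enumerate(subsets) for t in subsets[i + 1:]]
    return float(np.mean([len(s & t) / len(s | t) for s, t in pairs]))


# Synthetic two-class data (assumed for the demo): 50 features, of which
# only the first 5 differ between the classes.
rng = np.random.default_rng(42)
y = np.arange(200) % 2
X = rng.normal(size=(200, 50))
X[y == 1, :5] += 2.0

print("selected features:", sorted(top_k(X, y, k=5)))
print("stability:", selection_stability(X, y, k=5))
```

A score close to 1 indicates that the same features are chosen regardless of which samples happen to be drawn, which is the property a stability evaluation aims to quantify. Other overlap measures could be substituted for Jaccard without changing the structure of the loop.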