Author: R. Karthika
Venue: 2017 Second International Conference on Recent Trends and Challenges in Computational Models (ICRTCCM)
Published: 2017-02-01
DOI: 10.1109/ICRTCCM.2017.39 (https://doi.org/10.1109/ICRTCCM.2017.39)
An Efficient Feature Subset Selection for Improved Stability Using T-Statistic
Large amounts of high-dimensional data accumulate in databases every day. Data mining is used to extract useful information from such data. Before high-dimensional data can be classified or clustered, its dimensionality needs to be reduced. Feature selection keeps the features that are relevant to the analysis and discards those that are irrelevant or redundant. Many feature subset selection algorithms are available. In this paper, we evaluate the stability of the selected feature subset using a measure called T-Statistic and improve the prediction accuracy of the classifier using Booster.
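
The abstract does not define the T-Statistic measure or the Booster procedure, so the following is only a minimal sketch of the general idea, not the paper's method. It assumes a two-class problem with labels 0 and 1, ranks features by the absolute value of Welch's two-sample t-statistic, and estimates selection stability as the mean pairwise Jaccard similarity of the subsets chosen on bootstrap resamples. The function names `welch_t`, `select_top_k`, `jaccard`, and `selection_stability`, and the choice of Jaccard similarity, are illustrative assumptions.

```python
import math
import random
from statistics import mean, variance


def welch_t(a, b):
    """Welch's two-sample t-statistic for two lists of numbers."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def select_top_k(X, y, k):
    """Indices of the k features with the largest |t| between class 0 and class 1."""
    scores = []
    for j in range(len(X[0])):
        a = [row[j] for row, label in zip(X, y) if label == 0]
        b = [row[j] for row, label in zip(X, y) if label == 1]
        scores.append((abs(welch_t(a, b)), j))
    return {j for _, j in sorted(scores, reverse=True)[:k]}


def jaccard(s, t):
    """Jaccard similarity of two non-empty sets."""
    return len(s & t) / len(s | t)


def selection_stability(X, y, k, n_resamples=20, seed=0):
    """Mean pairwise Jaccard similarity of subsets selected on bootstrap resamples.

    Assumed stability measure for illustration; needs n_resamples >= 2.
    """
    if n_resamples < 2 or y.count(0) < 2 or y.count(1) < 2:
        raise ValueError("need >= 2 resamples and >= 2 samples per class")
    rng = random.Random(seed)
    n = len(X)
    subsets = []
    while len(subsets) < n_resamples:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y[i] for i in idx]
        # Skip resamples where a class has fewer than two samples,
        # because the sample variance is undefined there.
        if ys.count(0) < 2 or ys.count(1) < 2:
            continue
        subsets.append(select_top_k([X[i] for i in idx], ys, k))
    pairs = [(s, t) for i, s in enumerate(subsets) for t in subsets[i + 1:]]
    return mean(jaccard(s, t) for s, t in pairs)
```

A stability value near 1 means the same features are chosen regardless of which samples are drawn, while a value near 0 means the selected subset changes substantially between resamples.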