A Redundancy Based Unsupervised Feature Selection Method for High-Dimensional Data
Jian Zhou, Ding Liu
2021 13th International Conference on Machine Learning and Computing, 2021-02-26
https://doi.org/10.1145/3457682.3457725
Feature selection is the process of selecting key features from an initial feature set. It is commonly used as a preprocessing step to improve the efficiency and accuracy of classification models in artificial intelligence and machine learning. This paper proposes RUFS, a redundancy-based unsupervised feature selection method for high-dimensional data. First, RUFS roughly sorts the features in descending order of their average SU with the other features. Second, RUFS checks each feature in that order to decide whether it is redundant. Finally, it obtains the feature subset by repeating the second step until all features have been checked. After the key features are selected, classifiers are trained to assess the quality of the selected feature subset. Compared with existing methods, the proposed RUFS method improves the mean classification accuracy on 11 real datasets by at least 8.1 percent.
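
The three steps described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that the abstract does not confirm: "SU" is taken to mean symmetric uncertainty, SU(X, Y) = 2 I(X; Y) / (H(X) + H(Y)); the features are discrete; and a feature is treated as redundant when its SU with any already kept feature reaches a fixed `threshold`. The function names and the threshold value are hypothetical and are not the authors' implementation.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def symmetric_uncertainty(x, y):
    """SU(x, y) = 2 * I(x; y) / (H(x) + H(y)); assumed meaning of "SU"."""
    hx, hy = entropy(x), entropy(y)
    if hx + hy == 0:
        return 0.0  # both features are constant; treat them as uninformative
    mutual_info = hx + hy - entropy(list(zip(x, y)))
    return 2.0 * mutual_info / (hx + hy)


def select_features(columns, threshold=0.5):
    """Return indices of kept features.

    columns: list of equal-length lists, one per discrete feature.
    threshold: hypothetical redundancy cut-off, not taken from the paper.
    """
    m = len(columns)
    su = [[symmetric_uncertainty(columns[i], columns[j]) for j in range(m)]
          for i in range(m)]
    # Step 1: order features by their average SU with the other features.
    avg = [sum(su[i][j] for j in range(m) if j != i) / max(m - 1, 1)
           for i in range(m)]
    order = sorted(range(m), key=lambda i: avg[i], reverse=True)
    # Steps 2 and 3: visit every feature in that order and keep it only if
    # it is not redundant with any feature kept so far (assumed criterion).
    selected = []
    for i in order:
        if all(su[i][k] < threshold for k in selected):
            selected.append(i)
    return selected


if __name__ == "__main__":
    f0 = [0, 0, 1, 1, 0, 0, 1, 1]
    f1 = list(f0)                   # exact duplicate of f0
    f2 = [0, 1, 0, 1, 0, 1, 0, 1]   # statistically independent of f0
    print(select_features([f0, f1, f2]))
```

In this toy input the duplicate feature `f1` is discarded because its SU with `f0` is 1, while the independent feature `f2` is kept. Continuous features would need to be discretized before this sketch applies.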