A Supervised Feature Selection Algorithm through Minimum Spanning Tree Clustering
Qin Liu, Jingxiao Zhang, Jiakai Xiao, Hongming Zhu, Qinpei Zhao
2014 IEEE 26th International Conference on Tools with Artificial Intelligence, 2014-11-10
DOI: 10.1109/ICTAI.2014.47 (https://doi.org/10.1109/ICTAI.2014.47)
Citations: 12
Abstract
Among the different types of feature selection algorithms, feature clustering is an emerging subset generation paradigm. In this paper, a Minimum spanning tree based Feature Clustering (MFC) algorithm is proposed. The algorithm uses an information-theoretic measure, variation of information, as the metric for both feature redundancy and feature relevance. In the clustering phase, the sum of pairwise feature redundancy is minimized. Then a representative feature is selected from each cluster such that the relevance between the representative features and the target label is maximized. The algorithm is supervised, since it is designed for supervised learning problems such as classification and regression. The proposed MFC is compared with three conventional feature selection algorithms, two of which are also feature clustering methods. In the experiments, MFC obtains well-separated feature clusters and considerably better classification accuracy on several real data sets.
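The pipeline described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how the spanning tree is split into clusters or how ties are resolved, so the sketch assumes discrete features, cuts the heaviest tree edges (a common MST clustering heuristic), and treats the lowest variation of information to the target as the highest relevance. All function names (`variation_of_information`, `cluster_features`, `select_representatives`) and the parameter `k` are hypothetical.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def variation_of_information(x, y):
    """VI(X, Y) = 2 H(X, Y) - H(X) - H(Y); 0 means X and Y carry the same information."""
    return 2 * entropy(list(zip(x, y))) - entropy(x) - entropy(y)


def mst_edges(dist):
    """Prim's algorithm on a dense symmetric distance matrix; returns (weight, u, v) edges."""
    n = len(dist)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        w, u, v = min(
            (dist[u][v], u, v)
            for u in in_tree
            for v in range(n)
            if v not in in_tree
        )
        edges.append((w, u, v))
        in_tree.add(v)
    return edges


def cluster_features(features, k):
    """Group feature indices into k clusters (assumption: cut the k-1 heaviest MST edges)."""
    n = len(features)
    dist = [
        [variation_of_information(features[i], features[j]) for j in range(n)]
        for i in range(n)
    ]
    kept = sorted(mst_edges(dist))[: n - k]  # keep the n-k lightest tree edges

    # Union-find to collect the connected components left after cutting.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for _, u, v in kept:
        parent[find(u)] = find(v)

    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


def select_representatives(features, target, k):
    """Pick, per cluster, the feature with the smallest VI to the target (assumed relevance rule)."""
    return sorted(
        min(c, key=lambda i: variation_of_information(features[i], target))
        for c in cluster_features(features, k)
    )
```

Each feature is passed as a list of discrete values, one per sample, and `target` is the label column in the same sample order. Continuous features would need to be discretized first, which this sketch leaves out.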