{"title":"最大化零件互信息的特征选择","authors":"Wanfu Gao, Liang Hu, Ping Zhang","doi":"10.1145/3297067.3297068","DOIUrl":null,"url":null,"abstract":"Feature selection is an important preprocessing stage in signal processing and machine learning. Feature selection methods choose the most informative feature subset for classification. Mutual information and conditional mutual information are used extensively in feature selection methods. However, mutual information suffers from an overestimation problem, with conditional mutual information suffering from a problem of underestimation. To address the issues of overestimation and underestimation, we introduce a new measure named part mutual information that could accurately quantify direct association among variables. The proposed method selects the maximal value of cumulative summation of the part mutual information between candidate features and class labels when each selected feature is known. To evaluate the classification performance of the proposed method, our method is compared with four state-of the-art feature selection methods on twelve real-world data sets. Extensive studies demonstrate that our method outperforms the four compared methods in terms of average classification accuracy and the highest classification accuracy.","PeriodicalId":340004,"journal":{"name":"International Conference on Signal Processing and Machine Learning","volume":"38 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Feature Selection by Maximizing Part Mutual Information\",\"authors\":\"Wanfu Gao, Liang Hu, Ping Zhang\",\"doi\":\"10.1145/3297067.3297068\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Feature selection is an important preprocessing stage in signal processing and machine learning. Feature selection methods choose the most informative feature subset for classification. 
Mutual information and conditional mutual information are used extensively in feature selection methods. However, mutual information suffers from an overestimation problem, with conditional mutual information suffering from a problem of underestimation. To address the issues of overestimation and underestimation, we introduce a new measure named part mutual information that could accurately quantify direct association among variables. The proposed method selects the maximal value of cumulative summation of the part mutual information between candidate features and class labels when each selected feature is known. To evaluate the classification performance of the proposed method, our method is compared with four state-of the-art feature selection methods on twelve real-world data sets. Extensive studies demonstrate that our method outperforms the four compared methods in terms of average classification accuracy and the highest classification accuracy.\",\"PeriodicalId\":340004,\"journal\":{\"name\":\"International Conference on Signal Processing and Machine Learning\",\"volume\":\"38 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-11-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Conference on Signal Processing and Machine Learning\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3297067.3297068\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Conference on Signal Processing and Machine 
Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3297067.3297068","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Feature Selection by Maximizing Part Mutual Information
Feature selection is an important preprocessing stage in signal processing and machine learning: it chooses the most informative feature subset for classification. Mutual information and conditional mutual information are used extensively in feature selection methods. However, mutual information suffers from an overestimation problem, while conditional mutual information suffers from underestimation. To address both issues, we introduce a new measure named part mutual information, which accurately quantifies direct associations among variables. The proposed method greedily selects the candidate feature that maximizes the cumulative sum of part mutual information between that feature and the class labels, conditioned on each already-selected feature. To evaluate the classification performance of the proposed method, we compare it with four state-of-the-art feature selection methods on twelve real-world data sets. Extensive experiments demonstrate that our method outperforms the four compared methods in terms of both average and highest classification accuracy.
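To make the selection criterion concrete, here is a minimal sketch of the greedy procedure the abstract describes, for discrete features. The part-mutual-information estimator follows the plug-in definition PMI(X;Y|Z) = Σ p(x,y,z) log[ p(x,y|z) / (p*(x|z) p*(y|z)) ] with p*(x|z) = Σ_y p(x|z,y)p(y) (Zhao et al., 2016); the function names, the dummy constant used for the empty conditioning set, and the exact tie-breaking are our illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def pmi_discrete(x, y, z):
    """Plug-in estimate of part mutual information PMI(X;Y|Z) for
    discrete 1-D arrays. When Z is constant this reduces to plain MI."""
    n = len(x)
    xs, ys, zs = np.unique(x), np.unique(y), np.unique(z)
    xi = {v: i for i, v in enumerate(xs)}
    yi = {v: i for i, v in enumerate(ys)}
    zi = {v: i for i, v in enumerate(zs)}
    # joint distribution p(x, y, z) from counts
    p = np.zeros((len(xs), len(ys), len(zs)))
    for a, b, c in zip(x, y, z):
        p[xi[a], yi[b], zi[c]] += 1.0
    p /= n
    px = p.sum(axis=(1, 2))   # p(x)
    py = p.sum(axis=(0, 2))   # p(y)
    pz = p.sum(axis=(0, 1))   # p(z)
    pyz = p.sum(axis=0)       # p(y, z), shape (|Y|, |Z|)
    pxz = p.sum(axis=1)       # p(x, z), shape (|X|, |Z|)
    eps = 1e-12
    # p*(x|z) = sum_y p(x|z,y) p(y);  p*(y|z) symmetrically
    pstar_x_z = np.einsum('xyz,y->xz', p / (pyz[None, :, :] + eps), py)
    pstar_y_z = np.einsum('xyz,x->yz', p / (pxz[:, None, :] + eps), px)
    pxy_given_z = p / (pz[None, None, :] + eps)
    val = 0.0
    for i in range(len(xs)):
        for j in range(len(ys)):
            for k in range(len(zs)):
                if p[i, j, k] > 0 and pstar_x_z[i, k] > 0 and pstar_y_z[j, k] > 0:
                    val += p[i, j, k] * np.log(
                        pxy_given_z[i, j, k] / (pstar_x_z[i, k] * pstar_y_z[j, k]))
    return val

def select_features(X, y, k):
    """Greedy forward selection: at each step pick the candidate f that
    maximizes the cumulative sum over already-selected features s of
    PMI(f; y | s). With nothing selected yet, condition on a constant,
    so the criterion reduces to ordinary mutual information."""
    selected, remaining = [], list(range(X.shape[1]))
    const = np.zeros(X.shape[0], dtype=int)  # dummy conditioning variable
    for _ in range(k):
        def score(f):
            if not selected:
                return pmi_discrete(X[:, f], y, const)
            return sum(pmi_discrete(X[:, f], y, X[:, s]) for s in selected)
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

As a quick sanity check, a feature identical to the label scores near its entropy and is picked first, while an independent noise feature scores near zero.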