The Efficiency of Aggregation Methods in Ensemble Filter Feature Selection Models

N. Noureldien, Saffa Mohmoud
Transactions on Machine Learning and Artificial Intelligence, published 2021-08-17. DOI: 10.14738/tmlai.94.10101
Citations: 2

Abstract

Ensemble feature selection is recommended because it has been shown to produce a more stable subset of features and better classification accuracy than individual feature selection methods. In this approach, the outputs of the feature selection methods, called base selectors, are combined using an aggregation method. For filter feature selection methods, a list aggregation method is needed to merge the output ranked lists into a single list, and since many list aggregation methods have been proposed, deciding which one to use to build the optimum ensemble model is, in practice, an open question.

In this paper, we investigate the efficiency of four aggregation methods: Min, Median, Arithmetic Mean, and Geometric Mean. The performance of the aggregation methods is evaluated on five datasets from different scientific fields with varying numbers of instances and features. The classifiers used in the evaluation are drawn from three different classes: Trees, Rules, and Bayes.

The experimental results show that 11 of the 15 best performance results correspond to ensemble models. Among these 11 best-performing ensemble models, the most efficient aggregation method is Median (5/11), followed by Arithmetic Mean (3/11) and Min (3/11). The results also show that as the number of features increases, the most efficient aggregation method shifts from Min to Median to Arithmetic Mean, which may suggest that for a very large number of features the Arithmetic Mean is the most efficient choice. In general, no single aggregation method is best for all cases.
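The aggregation step described in the abstract can be illustrated with a short sketch. The snippet below is not taken from the paper: the choice of base selectors (chi-squared, ANOVA F, mutual information), the dataset, and the top-k cutoff are illustrative assumptions. It only shows how per-feature ranks produced by several filter methods can be merged with the four aggregation functions compared here (Min, Median, Arithmetic Mean, Geometric Mean).

```python
# Minimal sketch of ensemble filter feature selection with rank aggregation.
# Base selectors, dataset, and k are assumptions, not the paper's exact setup.
import numpy as np
from scipy.stats import rankdata, gmean
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import chi2, f_classif, mutual_info_classif

X, y = load_breast_cancer(return_X_y=True)  # all-positive features, so chi2 is valid

# Base selectors: each filter scores every feature (higher score = more relevant).
scores = [
    chi2(X, y)[0],
    f_classif(X, y)[0],
    mutual_info_classif(X, y, random_state=0),
]

# Convert scores to ranks (rank 1 = best feature) so the lists are comparable.
ranks = np.vstack([rankdata(-s, method="average") for s in scores])

# The four list-aggregation methods compared in the paper.
aggregators = {
    "Min": lambda r: r.min(axis=0),
    "Median": lambda r: np.median(r, axis=0),
    "Arithmetic Mean": lambda r: r.mean(axis=0),
    "Geometric Mean": lambda r: gmean(r, axis=0),
}

k = 10  # number of features to keep (illustrative)
for name, agg in aggregators.items():
    combined = agg(ranks)                # one aggregated rank per feature
    selected = np.argsort(combined)[:k]  # smallest aggregated rank wins
    print(f"{name:16s} -> features {sorted(selected.tolist())}")
```

In a full ensemble model, the selected feature subset would then be passed to a classifier (for example a tree-, rule-, or Bayes-based learner, as in the paper's evaluation) to compare the aggregation methods by classification accuracy.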