Using Diversity for Classifier Ensemble Pruning: An Empirical Investigation

M. A. O. Ahmed, Luca Didaci, Bahram Lavi, G. Fumera
Journal: Theoretical and Applied Informatics, vol. 38, no. 1
DOI: 10.20904/291-2025
Published: 2018-03-19 (Journal Article)
Citations: 10

Abstract

The concept of 'diversity' has been one of the main open issues in the field of multiple classifier systems. In this paper we address a facet of diversity related to its effectiveness for ensemble construction, namely, explicitly using diversity measures in ensemble construction techniques based on the overproduce-and-choose strategy known as ensemble pruning. Such a strategy consists of selecting the (hopefully) more accurate subset of classifiers out of an original, larger ensemble. Whereas several existing pruning methods use some combination of individual classifiers' accuracy and diversity, it is still unclear whether such an evaluation function is better than the bare estimate of ensemble accuracy. We empirically investigate this issue by comparing two evaluation functions in the context of ensemble pruning: the estimate of ensemble accuracy, and its linear combination with several well-known diversity measures. This can also be viewed as using diversity as a regularizer, as suggested by some authors. To this end we use a pruning method based on forward selection, since it allows a direct comparison between different evaluation functions. Experiments on thirty-seven benchmark data sets, four diversity measures and three base classifiers provide evidence that using diversity measures for ensemble pruning can be advantageous over using only ensemble accuracy, and that diversity measures can act as regularizers in this context.
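To make the evaluated scheme concrete, the following is a minimal sketch of forward-selection ensemble pruning with an evaluation function of the form acc(S) + λ·div(S). It is not the paper's implementation: the diversity measure here is mean pairwise disagreement (one plausible choice; the paper compares several measures), and all names (`forward_select`, `lam`, etc.) are illustrative.

```python
# Hedged sketch: greedy forward selection of a sub-ensemble maximizing
# eval(S) = accuracy(majority_vote(S)) + lam * mean_pairwise_disagreement(S).
# With lam = 0 this reduces to the bare ensemble-accuracy criterion the
# paper uses as the baseline; lam > 0 adds diversity as a regularizer.
from itertools import combinations

def majority_vote(preds, n_samples):
    """Combine per-classifier prediction lists by plurality vote."""
    votes = []
    for i in range(n_samples):
        counts = {}
        for p in preds:
            counts[p[i]] = counts.get(p[i], 0) + 1
        votes.append(max(counts, key=counts.get))
    return votes

def accuracy(pred, y):
    return sum(int(a == b) for a, b in zip(pred, y)) / len(y)

def disagreement(p1, p2):
    """Fraction of samples on which two classifiers differ."""
    return sum(int(a != b) for a, b in zip(p1, p2)) / len(p1)

def mean_pairwise_disagreement(preds):
    if len(preds) < 2:
        return 0.0
    pairs = list(combinations(preds, 2))
    return sum(disagreement(a, b) for a, b in pairs) / len(pairs)

def forward_select(all_preds, y, k, lam):
    """Greedily grow a sub-ensemble of size k on validation predictions."""
    selected, remaining = [], list(range(len(all_preds)))
    while len(selected) < k:
        best, best_score = None, -1.0
        for j in remaining:
            preds = [all_preds[i] for i in selected + [j]]
            score = (accuracy(majority_vote(preds, len(y)), y)
                     + lam * mean_pairwise_disagreement(preds))
            if score > best_score:
                best, best_score = j, score
        selected.append(best)
        remaining.remove(best)
    return selected
```

Because forward selection only swaps the evaluation function, the same loop supports a direct comparison between the accuracy-only criterion (λ = 0) and each accuracy-plus-diversity variant, which is why the paper adopts this pruning method for its experiments.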