Dynamic Ensemble Associative Learning

Md Rayhan Kabir, Osmar R Zaiane
DOI: 10.1109/ASONAM55673.2022.10068715
Published in: 2022 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), 2022-11-10
Citations: 2

Abstract

Associative classifiers have shown competitive performance with state-of-the-art methods for predicting class labels. Beyond accuracy, associative classifiers produce human-readable classification rules, which makes the model's decision process easier to understand. Early associative classifiers were limited by the need to select proper threshold values, which are dataset-specific. Recent work on associative classifiers eliminates that restriction by searching for statistically significant rules. However, a high-dimensional feature vector in the training data degrades the performance of the model. Ensemble models like Random Forest are also very powerful classification tools, but the decision process of Random Forest is not as easily interpretable as that of associative classifiers. In this study we propose Dynamic Ensemble Associative Learning (DEAL), where we use associative classifiers as base learners on feature sub-spaces. In our approach, we select a subset of the feature vector to train each base learner. Instead of random selection, we propose a dynamic feature sampling procedure that automatically determines the number of base learners and ensures diversity and completeness among the feature subsets. We use 10 datasets from the UCI repository and evaluate the performance of the model in terms of accuracy and memory requirements. Our ensemble approach with the proposed sampling method largely decreases the memory requirement for datasets with a large number of features, without jeopardising accuracy; in fact, accuracy improves in most cases. Moreover, the decision process of our DEAL approach remains human-interpretable: the rules generated by the base learners are collected and ranked to predict the final class label.
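The sampling idea described above can be illustrated with a minimal sketch. This is not the authors' actual DEAL procedure (the abstract does not specify it); it is a hypothetical example of feature sub-space sampling where every feature is covered at least once (completeness), no two subsets coincide (diversity), and the number of base learners falls out of the coverage requirement rather than being fixed in advance. The function name and the square-root subset-size heuristic are assumptions for illustration only.

```python
import math
import random

def dynamic_feature_sampling(n_features, subset_size=None, seed=0):
    """Illustrative sketch: partition feature indices into subsets so that
    every feature appears in at least one subset (completeness) and the
    subsets differ from one another (diversity). The number of subsets,
    and hence of base learners, is derived from coverage, not preset."""
    rng = random.Random(seed)
    if subset_size is None:
        # heuristic subset size, loosely analogous to Random Forest's sqrt(d)
        subset_size = max(2, math.ceil(math.sqrt(n_features)))
    uncovered = list(range(n_features))
    rng.shuffle(uncovered)
    subsets = []
    while uncovered:
        take = uncovered[:subset_size]
        uncovered = uncovered[subset_size:]
        # pad the last subset with already-covered features so every
        # base learner sees an equally sized feature sub-space
        while len(take) < subset_size:
            extra = rng.randrange(n_features)
            if extra not in take:
                take.append(extra)
        subsets.append(sorted(take))
    return subsets

# each subset would train one associative base learner; their rules are
# then pooled and ranked to predict the final class label
subsets = dynamic_feature_sampling(20)
assert sorted({f for s in subsets for f in s}) == list(range(20))  # completeness
```

With 20 features the heuristic yields subsets of size 5, so four base learners emerge automatically; a dataset with more features would produce more learners without any user-supplied parameter.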