Dynamic Ensemble Associative Learning
Md Rayhan Kabir, Osmar R. Zaiane
2022 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM)
Published: 2022-11-10
DOI: 10.1109/ASONAM55673.2022.10068715
Citations: 2
Abstract
Associative classifiers have shown performance competitive with state-of-the-art methods for predicting class labels. Beyond accuracy, associative classifiers produce human-readable classification rules, making the model's decision process easier to understand. Early associative classifiers suffered from the limitation of requiring dataset-specific threshold values; recent work removes that restriction by searching for statistically significant rules instead. However, a high-dimensional feature vector in the training data still degrades the model's performance. Ensemble models such as Random Forest are also very powerful classification tools, but their decision process is not as easily interpretable as that of associative classifiers. In this study we propose Dynamic Ensemble Associative Learning (DEAL), which uses associative classifiers as base learners on feature sub-spaces. In our approach we select a subset of the feature vector to train each base learner. Instead of random selection, we propose a dynamic feature-sampling procedure that automatically determines the number of base learners and ensures diversity and completeness among the feature subsets. We evaluate the model on 10 datasets from the UCI repository in terms of accuracy and memory requirement. Our ensemble approach with the proposed sampling method substantially reduces the memory requirement on datasets with a large number of features without jeopardising accuracy; in fact, accuracy also improves in most cases. Moreover, the decision process of DEAL remains human-interpretable: the rules generated by the base learners are collected and ranked to predict the final class label.
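The abstract does not give the details of the dynamic feature-sampling procedure, but its two stated guarantees (every feature is covered, and the number of base learners follows from the data rather than being fixed in advance) can be sketched. The following is a minimal illustrative stand-in, not the published DEAL algorithm: it shuffles the feature indices and slices them into disjoint subsets of a chosen size, so completeness holds by construction and the learner count falls out of the dimensionality. The function name and `subset_size` parameter are assumptions for illustration.

```python
import random


def dynamic_feature_sampling(n_features, subset_size, seed=0):
    """Illustrative completeness-preserving feature sampler (not DEAL's
    actual procedure).

    Shuffles the feature indices and partitions them into subsets of at
    most `subset_size`, so every feature appears in exactly one subset
    and the number of base learners is derived from the dimensionality.
    """
    rng = random.Random(seed)
    indices = list(range(n_features))
    rng.shuffle(indices)  # shuffling gives diversity across runs/seeds
    return [indices[i:i + subset_size]
            for i in range(0, n_features, subset_size)]


subsets = dynamic_feature_sampling(n_features=10, subset_size=4)
# Completeness: every feature index appears in exactly one subset.
assert sorted(i for s in subsets for i in s) == list(range(10))
# The number of base learners is determined automatically: ceil(10/4) = 3.
assert len(subsets) == 3
```

Each subset would then train one associative base learner, with the learners' rules pooled and ranked at prediction time, as the abstract describes.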