Classification by Frequent Association Rules

Impact factor: 0.4; JCR: Q4; Category: COMPUTER SCIENCE, INFORMATION SYSTEMS
Md Rayhan Kabir, Osmar Zaiane
DOI: 10.1145/3555776.3577848 (https://doi.org/10.1145/3555776.3577848)
Journal: Applied Computing Review
Published: 2023-03-27
Publication type: Journal Article
Citations: 1

Abstract

Over the last two decades, associative classifiers have shown competitive performance in predicting class labels. Beyond accuracy, associative classifiers produce human-readable predictive rules, which help explain the classifier's decision process. Early associative classifiers required threshold values that had to be tuned for each dataset. Some recent studies removed that limitation by producing statistically significant rules. Although these recent models are very competitive with state-of-the-art classifiers, their performance still suffers when the feature vector of the training data is very large. An ensemble model can address this issue by training each base learner on a subset of the feature vector. In this study, we propose Classification by Frequent Association Rules (CFAR), an ensemble model that uses associative classifiers as base learners. Instead of a classical ensemble with a voting method, our approach ranks the generated rules by their predominance among the base learners and selects a subset of the rules for predicting class labels. We evaluate the proposed model on 10 datasets from the UCI repository. CFAR eliminates the high memory requirement and runtime that recent associative classifiers incur when training datasets have large feature vectors. On the datasets we used, CFAR increases accuracy in most cases and also removes noisy rules, which enhances the interpretability of the model.
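The abstract's core idea (base learners trained on feature subsets, rules ranked by how many base learners produce them, and a top-ranked subset used for prediction) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names, the toy dataset, and every parameter are hypothetical. The confidence-threshold miner is only a simple stand-in for the statistically significant rule miners the abstract refers to. Scoring a prediction by summing the votes of matching rules is also an assumption, since the abstract does not say how the selected rules are applied.

```python
import random
from collections import Counter
from itertools import combinations


def mine_rules(rows, labels, features, min_conf=0.8, min_count=2):
    """Toy stand-in for an associative base learner (an assumption, not CFAR's miner).

    Emits (antecedent, label) rules over `features`, where the antecedent is a
    frozenset of one or two (feature, value) items, the rule covers at least
    `min_count` rows, and its confidence is at least `min_conf`.
    """
    antecedent_counts = Counter()
    rule_counts = Counter()
    for row, label in zip(rows, labels):
        items = [(f, row[f]) for f in features]
        for size in (1, 2):
            for combo in combinations(items, size):
                antecedent = frozenset(combo)
                antecedent_counts[antecedent] += 1
                rule_counts[(antecedent, label)] += 1
    return {
        rule
        for rule, count in rule_counts.items()
        if count >= min_count and count / antecedent_counts[rule[0]] >= min_conf
    }


def rank_rules_by_predominance(rows, labels, n_learners=5, subset_size=2,
                               top_k=10, seed=0):
    """Train each base learner on a random feature subset, then rank rules by
    the number of base learners that produced them and keep the top `top_k`."""
    rng = random.Random(seed)
    all_features = sorted(rows[0])
    votes = Counter()
    for _ in range(n_learners):
        subset = rng.sample(all_features, subset_size)
        votes.update(mine_rules(rows, labels, subset))  # one vote per learner
    ranked = sorted(
        votes.items(),
        key=lambda kv: (-kv[1], sorted(kv[0][0]), kv[0][1]),  # stable tie-break
    )
    return ranked[:top_k]


def predict(ranked_rules, row, default=None):
    """Assumed scoring scheme: sum the learner votes of every matching rule."""
    items = set(row.items())
    scores = Counter()
    for (antecedent, label), count in ranked_rules:
        if antecedent <= items:
            scores[label] += count
    return scores.most_common(1)[0][0] if scores else default


# Hypothetical toy data: "outlook" and "humidity" determine the label,
# while "windy" is noise.
rows = [
    {"outlook": "sunny", "humidity": "low", "windy": "no"},
    {"outlook": "sunny", "humidity": "low", "windy": "yes"},
    {"outlook": "rainy", "humidity": "high", "windy": "no"},
    {"outlook": "rainy", "humidity": "high", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

ranked = rank_rules_by_predominance(rows, labels)
print(predict(ranked, {"outlook": "sunny", "humidity": "low", "windy": "yes"}))
```

In this toy setup, rules built only on the noisy "windy" feature never reach the confidence threshold, so they collect no votes and drop out of the ranking. That loosely mirrors the abstract's remark about removing noisy rules, though the real model's selection criteria may differ.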
Source journal
Applied Computing Review (COMPUTER SCIENCE, INFORMATION SYSTEMS)
Self-citation rate: 40.00%
Articles published: 8