Toward a Distinguishing Approach for Improving the Apriori Algorithm

Mahdieh Dehghani, A. Kamandi, M. Shabankhah, A. Moeini
2019 9th International Conference on Computer and Knowledge Engineering (ICCKE), pp. 309-314. Published 2019-10-01. DOI: 10.1109/ICCKE48569.2019.8965206 (https://doi.org/10.1109/ICCKE48569.2019.8965206)
Citations: 2

Abstract

Association rule mining, one of the most important branches of data mining, focuses on detecting frequent patterns of itemsets. Apriori was among the first algorithms proposed for association rule mining. It gives complete results: it detects all frequent itemsets in a transaction database. In the worst case, Apriori has time complexity on the order of 2^n, where n is the number of items in the database. At each step, the database is scanned again to detect frequent itemsets, so the algorithm has a very long response time on large databases. There are two ways to reduce this response time: first, prune the candidate itemsets that must be checked; second, reduce the dimension of the database. We take the second approach and reduce the dimension of the database using the property that if an itemset is frequent, all of its subsets are also frequent, with frequencies at least as high. In the proposed algorithm, the database is scanned only once, and frequent itemsets are then detected from the reduced database. Our algorithm improves on the response time of Apriori. To evaluate the algorithm, we use precision and recall. According to our experiments, in most cases the algorithm achieves precision and recall above ninety percent.
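
For orientation, the baseline that the abstract sets out to improve can be sketched as follows. This is a minimal illustration of classic Apriori (level-wise candidate generation, subset-based pruning, and one database scan per level), together with a small helper for the precision and recall measures mentioned above. It is not the authors' reduced-database method, which the abstract does not specify. The names `apriori`, `precision_recall`, and `min_support`, and the toy transactions, are assumptions made for this example.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset (frozenset): support count} for all frequent itemsets.

    min_support is an absolute count (an assumption of this sketch).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count single items with one scan.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) != k:
                continue
            # Prune: if an itemset is frequent, all its subsets are frequent,
            # so a candidate with any infrequent (k-1)-subset is discarded.
            if all(frozenset(s) in frequent for s in combinations(union, k - 1)):
                candidates.add(union)

        # One full database scan per level; this repeated scanning is the
        # cost the abstract refers to.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1
    return result


def precision_recall(found, exact):
    """Compare reported itemsets against the exact set of frequent itemsets."""
    found, exact = set(found), set(exact)
    true_pos = len(found & exact)
    precision = true_pos / len(found) if found else 1.0
    recall = true_pos / len(exact) if exact else 1.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    db = [
        {"bread", "milk"},
        {"bread", "diaper", "beer", "eggs"},
        {"milk", "diaper", "beer", "cola"},
        {"bread", "milk", "diaper", "beer"},
        {"bread", "milk", "diaper", "cola"},
    ]
    for itemset, support in sorted(apriori(db, 3).items(), key=lambda kv: len(kv[0])):
        print(sorted(itemset), support)
```

An approximate miner, such as one working from a reduced database, could be scored by passing its reported itemsets as `found` and the output of `apriori` as `exact`. That mirrors the kind of evaluation the abstract describes, although the authors' exact protocol is not given here.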