A new algorithm for fast mining frequent itemsets based on SO-Sets

Long Tan, Q. Qin
DOI: 10.1109/ICEICT.2016.7879713 (https://doi.org/10.1109/ICEICT.2016.7879713)
Published in: 2016 IEEE International Conference on Electronic Information and Communication Technology (ICEICT)
Publication date: 2016-08-01
Citations: 1

Abstract

N-list and B-list have been shown to be highly effective for mining frequent itemsets. The main drawback of both structures is that every node must be encoded with both a pre-order (start-order) code and a post-order (finish-order) code, which leads to excessive memory consumption during mining. In this paper, we propose SO-Sets, a more efficient data structure based on the SO-Tree, for mining frequent itemsets. SO-Sets require only the start-order (or finish-order) of each node, which saves considerable memory compared with N-list and B-list. Based on SO-Sets, we propose a new algorithm, FISO, for mining frequent itemsets. To evaluate its performance, we conducted experiments on five real datasets. The experimental results show that FISO has advantages in both running time and main-memory consumption.
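
As a rough illustration of the two-code scheme the abstract criticizes, the following Python sketch builds a small prefix tree over frequent items, assigns each node a pre-order and a post-order code, and collects per-item lists of (pre, post, count) triples in the style of an N-list. All names here (`Node`, `build_tree`, `assign_orders`, `is_ancestor`, `item_codes`) are hypothetical and chosen for this example; this is not the authors' implementation. The abstract does not describe how SO-Sets work with a single code, so that part is deliberately not shown.

```python
from collections import Counter


class Node:
    """Prefix-tree node carrying an item, a count, and two traversal codes."""

    def __init__(self, item):
        self.item = item
        self.count = 0
        self.children = {}
        self.pre = -1   # pre-order (start-order) code
        self.post = -1  # post-order (finish-order) code


def build_tree(transactions, min_support):
    """Insert each transaction, keeping only items with support >= min_support."""
    freq = Counter(i for t in transactions for i in set(t))
    root = Node(None)
    for t in transactions:
        # Order items by descending support, ties broken by item name.
        items = sorted((i for i in set(t) if freq[i] >= min_support),
                       key=lambda i: (-freq[i], i))
        node = root
        for i in items:
            node = node.children.setdefault(i, Node(i))
            node.count += 1
    return root


def assign_orders(root):
    """Assign pre-order and post-order codes with one depth-first traversal."""
    pre = post = 0

    def visit(n):
        nonlocal pre, post
        n.pre = pre
        pre += 1
        for child in n.children.values():
            visit(child)
        n.post = post
        post += 1

    visit(root)


def is_ancestor(a, b):
    """a is a proper ancestor of b iff a starts earlier and finishes later."""
    return a.pre < b.pre and a.post > b.post


def item_codes(root):
    """Collect, per item, the (pre, post, count) triples sorted by pre-order."""
    codes = {}
    stack = list(root.children.values())
    while stack:
        n = stack.pop()
        codes.setdefault(n.item, []).append((n.pre, n.post, n.count))
        stack.extend(n.children.values())
    return {item: sorted(triples) for item, triples in codes.items()}


if __name__ == "__main__":
    tree = build_tree([["a", "b", "c"], ["a", "b"], ["a", "c"], ["b", "c"]], 2)
    assign_orders(tree)
    for item, triples in sorted(item_codes(tree).items()):
        print(item, triples)
```

In this scheme each list entry stores two order codes plus a count. A structure that needs only one of the two codes per node, as the abstract says of SO-Sets, would store one fewer integer per entry, which is the source of the memory saving the authors claim.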