A new algorithm for fast mining of frequent itemsets based on SO-Sets

2016 IEEE International Conference on Electronic Information and Communication Technology (ICEICT). Pub Date: 2016-08-01. DOI: 10.1109/ICEICT.2016.7879713

Long Tan, Q. Qin


Citations: 1

Abstract



N-list and B-list have been shown to be highly effective for mining frequent itemsets. The main drawback of these two structures is that each node must be encoded with both a pre-order (start-order) code and a post-order (finish-order) code, which leads to excessive memory consumption during mining. In this paper, we propose SO-Sets, based on the SO-Tree, as a more efficient data structure for mining frequent itemsets. SO-Sets require only the start-order (or finish-order) code of each node, which saves a substantial amount of memory compared with N-list and B-list. Based on SO-Sets, we propose a new algorithm, FISO, for mining frequent itemsets. To evaluate the algorithms' performance, we conduct extensive experiments on five real datasets. The experimental results show that FISO has advantages in both running time and main-memory consumption.
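To make the memory argument concrete, the following is a minimal sketch of the two-code encoding that the abstract attributes to N-list and B-list: each node of a prefix tree stores both a pre-order (start-order) and a post-order (finish-order) code, and the pair is used to test ancestor relationships. The names `Node`, `build_tree`, `assign_orders`, and `is_ancestor` are hypothetical and chosen for illustration only; this is not the authors' FISO implementation, and the abstract does not describe how SO-Sets manage with a single code, so that part is not sketched here.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Prefix-tree node carrying both order codes (illustrative only)."""
    item: Optional[str]
    count: int = 0
    children: Dict[str, "Node"] = field(default_factory=dict)
    pre: int = -1   # pre-order (start-order) code
    post: int = -1  # post-order (finish-order) code


def build_tree(transactions: List[List[str]]) -> Node:
    """Insert each transaction as a path; items are assumed pre-sorted."""
    root = Node(item=None)
    for transaction in transactions:
        node = root
        for item in transaction:
            if item not in node.children:
                node.children[item] = Node(item=item)
            node = node.children[item]
            node.count += 1
    return root


def assign_orders(root: Node) -> None:
    """Number every node in pre-order and post-order with one traversal."""
    pre_counter = 0
    post_counter = 0

    def visit(node: Node) -> None:
        nonlocal pre_counter, post_counter
        node.pre = pre_counter
        pre_counter += 1
        for child in node.children.values():
            visit(child)
        node.post = post_counter
        post_counter += 1

    visit(root)


def is_ancestor(a: Node, b: Node) -> bool:
    """a is a proper ancestor of b iff a starts earlier and finishes later."""
    return a.pre < b.pre and a.post > b.post


# Toy data, invented for this example.
root = build_tree([["a", "b", "c"], ["a", "b"], ["a", "d"]])
assign_orders(root)
a = root.children["a"]
b = a.children["b"]
c = b.children["c"]
d = a.children["d"]
print(is_ancestor(a, c), is_ancestor(b, d))
```

In this sketch every node carries two integers purely for the ancestor test, which is the per-node overhead the abstract says SO-Sets reduce by keeping only one of the two codes.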
