A frequent itemset reduction algorithm for global pattern mining on distributed data streams

Shalini, Sanjay Kumar Jain
Published in: 2017 Tenth International Conference on Contemporary Computing (IC3), 2017-08-01
DOI: 10.1109/IC3.2017.8284320 (https://doi.org/10.1109/IC3.2017.8284320)
Citations: 1

Abstract

Extracting global frequent itemsets from big data distributed across multiple data streams, under real-time requirements, is a complex problem. In this article, we propose an algorithm that reduces the number of local frequent itemsets communicated to the root node when extracting global patterns from multiple distributed data streams. The algorithm sends only local frequent itemsets to the root node instead of a summary of the local data streams. We compress the sets of local frequent itemsets with the Frequent Itemset Reduction (FIR) algorithm before sending them to the root node. We present two indexing structures, I-list and Modified Seg-tree (MsegT), to store all local frequent itemsets at the root node. Our experimental study shows that the FIR algorithm substantially reduces communication cost and that MsegT produces considerably better results than I-list and a few state-of-the-art techniques.
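
The setting described in the abstract, in which each site reports only its local frequent itemsets and the root node combines them into global patterns, can be illustrated with a minimal Python sketch. This is not the FIR algorithm, and the root-side index here is a plain dictionary rather than I-list or MsegT; the abstract does not give those details. The function names, the brute-force counting, and the `max_size` parameter are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def local_frequent_itemsets(transactions, min_support, max_size=3):
    """Hypothetical helper: brute-force count of itemsets (up to max_size
    items) in one site's window, keeping those that are locally frequent."""
    counts = Counter()
    for t in transactions:
        items = sorted(set(t))
        for k in range(1, max_size + 1):
            for combo in combinations(items, k):
                counts[frozenset(combo)] += 1
    threshold = min_support * len(transactions)
    return {s: c for s, c in counts.items() if c >= threshold}


def merge_at_root(local_results, total_transactions, min_support):
    """Hypothetical helper: sum the supports reported by each site and keep
    itemsets whose summed support meets the global threshold."""
    totals = Counter()
    for result in local_results:
        totals.update(result)  # Counter.update adds counts from a mapping
    threshold = min_support * total_transactions
    return {s: c for s, c in totals.items() if c >= threshold}


# Two illustrative sites, each holding a small window of transactions.
site_a = [["a", "b"], ["a", "b", "c"], ["a"], ["b", "c"]]
site_b = [["a", "b"], ["a"], ["a", "c"], ["b"]]
min_support = 0.5

reports = [local_frequent_itemsets(w, min_support) for w in (site_a, site_b)]
global_patterns = merge_at_root(reports, len(site_a) + len(site_b), min_support)
```

Because a site stays silent about itemsets that are infrequent locally, the summed support at the root is a lower bound on the true global support. A scheme of this kind trades some accuracy for fewer messages, which is the communication cost the abstract is concerned with.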