A frequent itemset reduction algorithm for global pattern mining on distributed data streams
Shalini, Sanjay Kumar Jain
2017 Tenth International Conference on Contemporary Computing (IC3), 2017-08-01
DOI: 10.1109/IC3.2017.8284320
Citations: 1
Abstract
Extracting global frequent itemsets from big data distributed across multiple data streams, under real-time requirements, is a complex problem. In this article, we propose an algorithm that reduces the number of local frequent itemsets communicated to the root node in order to extract global patterns from multiple distributed data streams. The algorithm sends only local frequent itemsets to the root node instead of sending a summary of the local data streams. We compress the sets of local frequent itemsets and send them to the root node using an algorithm called Frequent Itemset Reduction (FIR). We present two indexing structures, I-list and Modified Seg-tree (MsegT), to store all local frequent itemsets at the root node. Our experimental study shows that the FIR algorithm reduces communication cost to a good extent and that MsegT produces substantially better results than I-list and a few state-of-the-art techniques.
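The general setting described in the abstract, where each local node reports only its frequent itemsets and the root node combines them into global patterns, can be illustrated with a minimal sketch. This is not the FIR algorithm: the abstract does not specify the compression step, and the I-list and MsegT structures are not reproduced here. The function names `local_frequent_itemsets` and `merge_at_root`, the brute-force enumeration, the `max_size` cap, and the plain dictionary used at the root are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def local_frequent_itemsets(transactions, min_support, max_size=3):
    """Brute-force local mining (illustrative only, not FIR).

    Returns {itemset: count} for every itemset of up to max_size items
    whose count in this batch is at least min_support * len(transactions).
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, min(max_size, len(items)) + 1):
            for combo in combinations(items, size):
                counts[frozenset(combo)] += 1
    threshold = min_support * len(transactions)
    return {itemset: c for itemset, c in counts.items() if c >= threshold}


def merge_at_root(local_results, total_transactions, min_support):
    """Sum the counts reported by the local nodes and keep itemsets that
    reach the global threshold.

    A plain Counter stands in for the root-side index. Because nodes only
    report itemsets that are locally frequent, the summed counts are lower
    bounds on the true global counts.
    """
    merged = Counter()
    for result in local_results:
        merged.update(result)  # adds counts key by key
    threshold = min_support * total_transactions
    return {itemset: c for itemset, c in merged.items() if c >= threshold}


# Two hypothetical local streams, each observed over a window of 4 transactions.
stream_a = [["a", "b"], ["a", "b", "c"], ["a", "c"], ["b"]]
stream_b = [["a", "b"], ["a", "b"], ["c"], ["a"]]

local_a = local_frequent_itemsets(stream_a, min_support=0.5)
local_b = local_frequent_itemsets(stream_b, min_support=0.5)

global_patterns = merge_at_root(
    [local_a, local_b],
    total_transactions=len(stream_a) + len(stream_b),
    min_support=0.5,
)

for itemset, count in sorted(global_patterns.items(), key=lambda kv: -kv[1]):
    print(sorted(itemset), count)
```

In this toy run the first node reports five itemsets and the second reports three, so the root receives eight entries rather than the eight raw transactions plus their contents. Reducing the number of entries sent in a realistic setting is the part the paper attributes to FIR, which this sketch leaves out.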