A frequent itemset reduction algorithm for global pattern mining on distributed data streams
Shalini, Sanjay Kumar Jain
2017 Tenth International Conference on Contemporary Computing (IC3), 2017-08-01
DOI: 10.1109/IC3.2017.8284320 (https://doi.org/10.1109/IC3.2017.8284320)
Extracting global frequent itemsets from big data that is distributed across multiple data streams, while meeting real-time requirements, is a complex problem. In this article, we propose an algorithm that reduces the number of local frequent itemsets communicated to the root node in order to extract global patterns from multiple distributed data streams. The algorithm sends only local frequent itemsets to the root node instead of sending summaries of the local data streams. We compress the sets of local frequent itemsets with an algorithm called Frequent Itemset Reduction (FIR) and send them to the root node. We present two indexing structures, I-list and Modified Seg-tree (MsegT), to store all local frequent itemsets at the root node. Our experimental study shows that the FIR algorithm reduces communication cost to a good extent and that MsegT produces substantially better results than I-list and a few state-of-the-art techniques.
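The communication pattern described above, in which each node mines its own stream and forwards only its local frequent itemsets to a root that merges them, can be sketched as follows. This is a minimal illustration, not the FIR algorithm: the abstract does not specify how FIR compresses itemsets or how I-list and MsegT are organised, so the sketch omits compression and uses a plain dictionary as a stand-in for the root index. The function names `local_frequent_itemsets` and `merge_at_root`, the `max_size` cap, and the sample streams are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def local_frequent_itemsets(transactions, min_support, max_size=3):
    """Brute-force mining of one local window (assumed small, for illustration).

    Returns {itemset: count} for itemsets whose relative support in this
    window is at least min_support.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    threshold = min_support * len(transactions)
    return {itemset: c for itemset, c in counts.items() if c >= threshold}


def merge_at_root(local_results, total_transactions, min_support):
    """Sum the counts reported by every node and keep globally frequent itemsets.

    A plain Counter stands in for the root index; it is NOT I-list or MsegT.
    """
    merged = Counter()
    for result in local_results:
        merged.update(result)  # Counter.update adds counts per itemset
    threshold = min_support * total_transactions
    return {itemset: c for itemset, c in merged.items() if c >= threshold}


# Two hypothetical local streams, each holding a window of four transactions.
streams = [
    [["a", "b"], ["a", "b", "c"], ["a", "c"], ["b"]],
    [["a", "b"], ["a"], ["a", "b"], ["c"]],
]
local_results = [local_frequent_itemsets(s, 0.5) for s in streams]
total = sum(len(s) for s in streams)
global_frequent = merge_at_root(local_results, total, 0.5)
print(global_frequent)
```

Because a node reports only the itemsets that are frequent locally, the merged count at the root is a lower bound on the true global count. In the sample data, the second stream contains `c` once but does not report it, so the root sees a count of 2 for `c` rather than 3.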