Authors: Dai Caiyan, Chen Ling
Journal: International Journal of Computational Science and Engineering, volume 5 1, pages 146-154 (Journal Article)
Publication date: 2016-05-05
DOI: 10.1504/IJCSE.2016.076217 (https://doi.org/10.1504/IJCSE.2016.076217)
Impact factor: 1.4; JCR: Q4 (Computer Science, Interdisciplinary Applications); citation count: 3; open access: no
An algorithm for mining frequent closed itemsets with density from data streams
Mining frequent closed itemsets from data streams is an important topic. In this paper, we propose an algorithm for mining frequent closed itemsets from data streams based on a time-fading model. The algorithm dynamically constructs a pattern tree and uses a fading factor to calculate the densities of the itemsets in the tree. To reduce memory cost, it deletes truly infrequent itemsets from the pattern tree. A density threshold function is designed to identify the truly infrequent itemsets that should be deleted; with this function, deleting them does not affect the result of frequent itemset detection. To reduce computation time, the algorithm updates the pattern tree and detects the frequent closed itemsets at fixed time intervals. We also analyse the error caused by deleting the infrequent itemsets. Experimental results indicate that our algorithm achieves higher accuracy and requires less memory and computation time than other algorithms.
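The ideas in the abstract (fading-factor densities, threshold-based deletion, and closed-itemset detection) can be illustrated with a small sketch. It is not the authors' algorithm: it uses a flat dictionary rather than a pattern tree, and the class name `FadingItemsetCounter`, the parameters `prune_ratio` and `max_size`, and the pruning rule (a fixed fraction of the faded stream weight) are assumptions made for illustration, since the abstract does not give the actual density threshold function.

```python
from itertools import combinations


class FadingItemsetCounter:
    """Toy time-fading density tracker (hypothetical; a flat dict, not a pattern tree)."""

    def __init__(self, fading=0.9, min_support=0.3, prune_ratio=0.1, max_size=2):
        self.fading = fading            # fading factor, 0 < fading < 1
        self.min_support = min_support  # relative support threshold
        self.prune_ratio = prune_ratio  # assumed pruning level, below min_support
        self.max_size = max_size        # cap on itemset size to keep the toy small
        self.t = 0                      # number of transactions seen so far
        self.density = {}               # itemset -> (density, time of last update)

    def total_density(self):
        # Faded weight of the whole stream: 1 + f + f^2 + ... + f^(t-1).
        return (1 - self.fading ** self.t) / (1 - self.fading)

    def current(self, itemset):
        # Decay the stored density lazily from its last update to now.
        d, last = self.density.get(itemset, (0.0, self.t))
        return d * self.fading ** (self.t - last)

    def add(self, transaction):
        # Each arriving transaction adds weight 1 to every itemset it contains.
        self.t += 1
        items = sorted(set(transaction))
        for k in range(1, min(self.max_size, len(items)) + 1):
            for combo in combinations(items, k):
                key = frozenset(combo)
                self.density[key] = (self.current(key) + 1.0, self.t)

    def prune(self):
        # Assumed rule: drop itemsets whose density fell below a fraction
        # of the faded stream weight. The paper's threshold function may differ.
        limit = self.prune_ratio * self.total_density()
        for key in [k for k in self.density if self.current(k) < limit]:
            del self.density[key]

    def frequent(self):
        limit = self.min_support * self.total_density()
        return {k: self.current(k) for k in self.density if self.current(k) >= limit}

    def frequent_closed(self):
        # Closed: no tracked proper superset has the same density.
        freq = self.frequent()
        return {
            k: d
            for k, d in freq.items()
            if not any(k < other and abs(d - od) < 1e-9 for other, od in freq.items())
        }


counter = FadingItemsetCounter(fading=0.5, min_support=0.5, prune_ratio=0.3)
for tx in (["a", "b"], ["a", "b"], ["a", "c"], ["a", "b"]):
    counter.add(tx)
counter.prune()
print(sorted(sorted(k) for k in counter.frequent_closed()))  # [['a'], ['a', 'b']]
```

In this run, {b} is frequent but not closed, because {a, b} occurs in exactly the same transactions and therefore has the same density. Calling `prune()` and `frequent_closed()` only every few transactions, rather than after each one, would mirror the fixed-interval updating mentioned in the abstract.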
About the journal:
Computational science and engineering is an emerging and promising discipline that is shaping future research and development in both academia and industry, in fields ranging from engineering, science, finance, and economics to the arts and humanities. New challenges arise in the modelling of complex systems, sophisticated algorithms, advanced scientific and engineering computing, and the associated (multidisciplinary) problem-solving environments. Because solving large and complex problems must cope with tight timing schedules, powerful algorithms and computational techniques are indispensable. IJCSE addresses the state of the art in all aspects of computational science and engineering, with emphasis on computational methods and techniques for science and engineering applications.