An Approximate Approach for Maintaining Recent Occurrences of Itemsets in a Sliding Window over Data Streams

Jia-Ling Koh, Shu-Ning Shin, Yuan-Bin Don
Published in: Complex Data Warehousing and Knowledge Discovery for Advanced Retrieval Development

Abstract

A data stream is an unbounded sequence of data elements generated at a rapid rate, and it provides a dynamic environment for collecting data. The knowledge embedded in a data stream is likely to change quickly over time, so capturing the recent trend of the data is an important issue when mining frequent itemsets over data streams. Although the sliding window model offers a good solution to this problem, the traditional approach has to maintain the complete occurrence information of the patterns within the sliding window. To estimate the approximate supports of patterns within a sliding window, the frequency changing point (FCP) method is proposed for monitoring the recent occurrences of itemsets over a data stream. In addition to a basic design that assumes exactly one transaction arrives at each time point, the FCP method is extended to maintain recent patterns over a data stream in which a block containing a varying number of transactions (zero or more) arrives within each fixed time unit. The recently frequent itemsets or representative patterns are then discovered approximately from the maintained structure. Experimental studies demonstrate that the proposed algorithms achieve high true positive rates and guarantee no false dismissals in the results. A theoretical analysis is provided for this guarantee. In addition, the authors' approach significantly reduces run-time memory usage compared with the previously proposed method.

DOI: 10.4018/978-1-60566-748-5.ch014
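
For orientation, the traditional sliding-window approach mentioned in the abstract can be sketched as follows. This is a minimal exact baseline and not the FCP method, whose data structure the abstract does not describe. The class name `ExactSlidingWindow`, the `max_len` cap on itemset size, and the relative `min_support` parameter are illustrative assumptions, and the sketch assumes exactly one transaction arrives per time point.

```python
from collections import Counter, deque
from itertools import combinations


class ExactSlidingWindow:
    """Exact baseline that stores every transaction in the window.

    Hypothetical illustration only; this is NOT the FCP method.
    """

    def __init__(self, window_size, max_len=2):
        self.window_size = window_size  # number of most recent transactions kept
        self.max_len = max_len          # largest itemset size counted (assumption)
        self.window = deque()           # complete occurrence information
        self.counts = Counter()         # itemset -> support count in the window

    def _itemsets(self, transaction):
        # Enumerate all itemsets of size 1..max_len contained in the transaction.
        items = sorted(set(transaction))
        for k in range(1, self.max_len + 1):
            for combo in combinations(items, k):
                yield frozenset(combo)

    def add(self, transaction):
        # Count the new transaction, then expire the oldest one if needed.
        self.window.append(transaction)
        for itemset in self._itemsets(transaction):
            self.counts[itemset] += 1
        if len(self.window) > self.window_size:
            expired = self.window.popleft()
            for itemset in self._itemsets(expired):
                self.counts[itemset] -= 1
                if self.counts[itemset] == 0:
                    del self.counts[itemset]

    def frequent(self, min_support):
        # min_support is a fraction of the transactions currently in the window.
        threshold = min_support * len(self.window)
        return {s: c for s, c in self.counts.items() if c >= threshold}


w = ExactSlidingWindow(window_size=3)
for t in [["a", "b"], ["a", "c"], ["a", "b"], ["b", "c"]]:
    w.add(t)
result = w.frequent(min_support=0.5)
```

Because the window must hold every transaction in order to expire old ones, memory grows with the window size. That is the cost the abstract says an approximate summary is meant to reduce.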