Pushing constraints into data streams

BigMine '13, Pub Date: 2013-08-11, DOI: 10.1145/2501221.2501232

Andreia Silva, C. Antunes


Citations: 6


Pushing constraints into data streams

One important challenge in data mining is dealing with complex, voluminous, and dynamic data. Thanks to advances in technology, data in many real-world applications arrive as continuous data streams rather than as traditional static datasets. Several techniques have been proposed to explore data streams, in particular for discovering frequent co-occurrences in data. However, a common criticism of frequent pattern mining is that it generates a huge number of patterns, regardless of user expertise, which makes the results very hard to analyze and use. These bottlenecks are even more evident with data streams, since new data arrive continuously and endlessly and many intermediate results must be kept in memory. Using constraints to filter the results is the most common approach to focusing the discovery on what is really interesting. There is therefore a need to integrate data stream mining with constrained mining. In this work we describe a set of strategies for pushing constraints into data stream mining, using a pattern tree structure that captures a summary of the current possible patterns. We also propose an algorithm that discovers patterns in data streams that satisfy any user-defined constraint.
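To make the idea of "pushing" a constraint concrete, here is a minimal Python sketch. It is not the pattern tree algorithm the abstract refers to, which is not described here. It only illustrates the general principle under stated assumptions: the constraint is anti-monotone (if an itemset violates it, every superset does too), so violating candidates can be discarded while each transaction is processed instead of being filtered out afterwards. The names `constrained_subsets` and `mine_stream`, the price table, and the budget are all hypothetical.

```python
from collections import Counter
from typing import Callable, Dict, FrozenSet, Iterable, List, Set

Itemset = FrozenSet[str]


def constrained_subsets(
    transaction: Set[str],
    satisfies: Callable[[Itemset], bool],
    max_len: int = 3,
) -> List[Itemset]:
    """Enumerate the itemsets of one transaction that satisfy the constraint.

    Assumes `satisfies` is anti-monotone, so a violating itemset is never
    extended: none of its supersets could satisfy the constraint either.
    """
    items = sorted(transaction)
    results: List[Itemset] = []

    def extend(prefix: Itemset, start: int) -> None:
        for i in range(start, len(items)):
            candidate = prefix | {items[i]}
            if not satisfies(candidate):
                continue  # constraint pushed: prune this whole branch
            results.append(candidate)
            if len(candidate) < max_len:
                extend(candidate, i + 1)

    extend(frozenset(), 0)
    return results


def mine_stream(
    stream: Iterable[Set[str]],
    satisfies: Callable[[Itemset], bool],
    min_support: float,
    max_len: int = 3,
) -> Dict[Itemset, int]:
    """Count constraint-satisfying itemsets over a stream of transactions.

    Keeps exact counts for simplicity; a real stream miner would use a
    bounded-memory summary instead (an assumption of this sketch).
    """
    counts: Counter = Counter()
    seen = 0
    for transaction in stream:
        seen += 1
        counts.update(constrained_subsets(transaction, satisfies, max_len))
    return {p: c for p, c in counts.items() if c / seen >= min_support}


# Hypothetical example: keep only itemsets whose total price is at most 3.
prices = {"a": 1, "b": 2, "c": 5}


def within_budget(itemset: Itemset) -> bool:
    return sum(prices[i] for i in itemset) <= 3


stream = [{"a", "b"}, {"a", "b", "c"}, {"a", "c"}, {"b"}]
patterns = mine_stream(stream, within_budget, min_support=0.5)
```

In this example no itemset containing `c` is ever counted, because `{c}` alone already exceeds the budget and the branch is cut before any superset is generated. A post-filtering approach would count those itemsets first and discard them only at the end, which is the memory cost the abstract points to.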
