2008 Eighth IEEE International Conference on Data Mining: Latest Papers, Page 6

Anti-monotonic Overlap-Graph Support Measures

2008 Eighth IEEE International Conference on Data Mining Pub Date : 2008-12-15 DOI: 10.1109/ICDM.2008.114

T. Calders, J. Ramon, D. V. Dyck

Citations: 23

Stream Sequential Pattern Mining with Precise Error Bounds

2008 Eighth IEEE International Conference on Data Mining Pub Date : 2008-12-15 DOI: 10.1109/ICDM.2008.154

L. F. Mendes, Bolin Ding, Jiawei Han

Abstract: Sequential pattern mining is an interesting data mining problem with many real-world applications. This problem has been studied extensively in static databases. However, in recent years, emerging applications have introduced a new form of data called data stream. In a data stream, new elements are generated continuously. This poses additional constraints on the methods used for mining such data: memory usage is restricted, the infinitely flowing original dataset cannot be scanned multiple times, and current results should be available on demand. This paper introduces two effective methods for mining sequential patterns from data streams: the SS-BE method and the SS-MB method. The proposed methods break the stream into batches and only process each batch once. The two methods use different pruning strategies that restrict the memory usage but can still guarantee that all true sequential patterns are output at the end of any batch. Both algorithms scale linearly in execution time as the number of sequences grows, making them effective methods for sequential pattern mining in data streams. The experimental results also show that our methods are very accurate in that only a small fraction of the patterns that are output are false positives. Even for these false positives, SS-BE guarantees that their true support is above a pre-defined threshold.

Citations: 66

Scalable Tensor Decompositions for Multi-aspect Data Mining

2008 Eighth IEEE International Conference on Data Mining Pub Date : 2008-12-15 DOI: 10.1109/ICDM.2008.89

T. Kolda, Jimeng Sun

Abstract: Modern applications such as Internet traffic, telecommunication records, and large-scale social networks generate massive amounts of data with multiple aspects and high dimensionalities. Tensors (i.e., multi-way arrays) provide a natural representation for such data. Consequently, tensor decompositions such as Tucker become important tools for summarization and analysis. One major challenge is how to deal with high-dimensional, sparse data. In other words, how do we compute decompositions of tensors where most of the entries of the tensor are zero? Specialized techniques are needed for computing the Tucker decompositions for sparse tensors because standard algorithms do not account for the sparsity of the data. As a result, a surprising phenomenon is observed by practitioners: despite the fact that there is enough memory to store both the input tensors and the factorized output tensors, memory overflows occur during the tensor factorization process. To address this intermediate blowup problem, we propose Memory-Efficient Tucker (MET). Based on the available memory, MET adaptively selects the right execution strategy during the decomposition. We provide quantitative and qualitative evaluation of MET on real tensors. It achieves over 1000X space reduction without sacrificing speed; it also allows us to work with much larger tensors that were too big to handle before. Finally, we demonstrate a data mining case-study using MET.

Citations: 371

WiFIsViz: Effective Visualization of Frequent Itemsets

2008 Eighth IEEE International Conference on Data Mining Pub Date : 2008-12-15 DOI: 10.1109/ICDM.2008.93

C. Leung, Pourang Irani, Christopher L. Carmichael

Citations: 55

Learning the Latent Semantic Space for Ranking in Text Retrieval

2008 Eighth IEEE International Conference on Data Mining Pub Date : 2008-12-15 DOI: 10.1109/ICDM.2008.68

Jun Yan, Shuicheng Yan, Ning Liu, Zheng Chen

Citations: 1

Comparative Evaluation of Anomaly Detection Techniques for Sequence Data

2008 Eighth IEEE International Conference on Data Mining Pub Date : 2008-12-15 DOI: 10.1109/ICDM.2008.151

V. Chandola, Varun Mithal, Vipin Kumar

Citations: 165

Mining Large Networks with Subgraph Counting

2008 Eighth IEEE International Conference on Data Mining Pub Date : 2008-12-15 DOI: 10.1109/ICDM.2008.109

Ilaria Bordino, D. Donato, A. Gionis, S. Leonardi

Citations: 75

A Probability Model for Projective Clustering on High Dimensional Data

2008 Eighth IEEE International Conference on Data Mining Pub Date : 2008-12-15 DOI: 10.1109/ICDM.2008.15

Lifei Chen, Q. Jiang, Shengrui Wang

Citations: 20

What Sperner Family Concept Class is Easy to Be Enumerated?

2008 Eighth IEEE International Conference on Data Mining Pub Date : 2008-12-15 DOI: 10.1109/ICDM.2008.131

Atsuyoshi Nakamura, Mineichi Kudo

Citations: 3

Spotting Significant Changing Subgraphs in Evolving Graphs

2008 Eighth IEEE International Conference on Data Mining Pub Date : 2008-12-15 DOI: 10.1109/ICDM.2008.112

Zheng Liu, J. Yu, Yiping Ke, Xuemin Lin, Lei Chen

Citations: 37