Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics: Latest Articles

Augmenting MATLAB with semantic objects for an interactive visual environment

Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics Pub Date : 2013-08-11 DOI: 10.1145/2501511.2501521

C. Lee, J. Choo, Duen Horng Chau, Haesun Park

Citations: 1

Towards anytime active learning: interrupting experts to reduce annotation costs

Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics Pub Date : 2013-08-11 DOI: 10.1145/2501511.2501524

M. E. Ramirez-Loaiza, A. Culotta, M. Bilgic

Citations: 7

Storygraph: extracting patterns from spatio-temporal data

Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics Pub Date : 2013-08-11 DOI: 10.1145/2501511.2501525

Ayush Shrestha, B. Miller, Ying Zhu, Yi Zhao

Citations: 22

Lytic: synthesizing high-dimensional algorithmic analysis with domain-agnostic, faceted visual analytics

Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics Pub Date : 2013-08-11 DOI: 10.1145/2501511.2501518

Edward Clarkson, J. Choo, John Turgeson, R. Decuir, Haesun Park

Citations: 2

Zips: mining compressing sequential patterns in streams

Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics Pub Date : 2013-08-11 DOI: 10.1145/2501511.2501520

Hoang Thanh Lam, T. Calders, Jie Yang, F. Mörchen, Dmitriy Fradkin

Abstract: We propose a streaming algorithm, based on the minimum description length (MDL) principle, for extracting non-redundant sequential patterns. For static databases, the MDL-based approach, which selects patterns based on their capacity to compress data rather than their frequency, was shown to be remarkably effective for extracting meaningful patterns and solving the redundancy issue in frequent itemset and sequence mining. The existing MDL-based algorithms, however, either start from a seed set of frequent patterns or require multiple passes through the data. As such, the existing approaches scale poorly and are unsuitable for large datasets. Therefore, our main contribution is the proposal of a new streaming algorithm, called Zips, that does not require a seed set of patterns and requires only one scan over the data. For Zips, we extended the Lempel-Ziv (LZ) compression algorithm in three ways. First, whereas LZ assigns codes uniformly as it builds up its dictionary while scanning the input, Zips assigns codewords according to the usage of the dictionary words; more heavily used words get shorter code lengths. Second, Zips also exploits non-consecutive occurrences of dictionary words for compression. Third, the well-known space-saving algorithm is used to evict unpromising words from the dictionary. Experiments on one synthetic and two real-world large-scale datasets show that our approach extracts meaningful compressing patterns with quality similar to that of the state-of-the-art multi-pass algorithms proposed for static databases of sequences. Moreover, our approach scales linearly with the size of data streams, whereas the existing algorithms do not.
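
Two of the ideas named in the Zips abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, and the code length is assumed here to be the idealized Shannon length, minus log2 of a word's relative usage, which is one common way to make heavily used words cheaper to encode. The second function is a single update step of the standard space-saving counter scheme, shown on plain items rather than on Zips dictionary words.

```python
import math


def code_lengths(usage):
    """Idealized code length in bits per dictionary word.

    Assumption for illustration: length = -log2(usage / total usage),
    so more heavily used words receive shorter codes.
    """
    total = sum(usage.values())
    return {word: -math.log2(count / total) for word, count in usage.items()}


def space_saving_update(counters, item, capacity):
    """Apply one space-saving update to a bounded counter table (in place).

    If the table is full and the item is new, the entry with the smallest
    count is evicted and the new item inherits that count plus one.
    """
    if item in counters:
        counters[item] += 1
    elif len(counters) < capacity:
        counters[item] = 1
    else:
        victim = min(counters, key=counters.get)
        counters[item] = counters.pop(victim) + 1


# Hypothetical usage counts for three dictionary words.
lengths = code_lengths({"ab": 4, "cd": 2, "ef": 2})

# A tiny stream processed with room for only two counters.
counters = {}
for symbol in ["a", "a", "b", "c"]:
    space_saving_update(counters, symbol, capacity=2)
```

In this toy run the word used half of the time costs one bit and the other two cost two bits each, and the arrival of "c" evicts the least-counted entry "b". How Zips combines usage-based codes with eviction in a single pass is described in the paper itself, not in this sketch.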

Citations: 5

Building blocks for exploratory data analysis tools

Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics Pub Date : 2013-08-11 DOI: 10.1145/2501511.2501515

S. Alspaugh, Marti A. Hearst, A. Ganapathi, R. Katz

Citations: 7

Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics

Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics Pub Date : 2013-08-11 DOI: 10.1145/2501511

Duen Horng Chau, Jilles Vreeken, M. Leeuwen, C. Faloutsos

Abstract: We have entered the era of big data. Massive datasets, surpassing terabytes and petabytes in size, are now commonplace. They arise in numerous settings in science, government, and enterprises, and technology exists by which we can collect and store such massive amounts of information. Yet, making sense of these data remains a fundamental challenge. We lack the means to exploratively analyze databases of this scale. Currently, few technologies allow us to freely "wander" around the data and make discoveries by following our intuition, or serendipity. While standard data mining aims at finding highly interesting results, it is typically computationally demanding and time consuming, and thus may not be well suited for interactive exploration of large datasets.

Interactive data mining techniques that aptly integrate human intuition, by means of visualization and intuitive human-computer interaction techniques, and machine computation support have been shown to help people gain significant insights into a wide range of problems. However, as datasets are being generated in larger volumes, higher velocity, and greater variety, creating effective interactive data mining techniques becomes a much harder task.

Citations: 4

One click mining: interactive local pattern discovery through implicit preference and performance learning

Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics Pub Date : 2013-08-11 DOI: 10.1145/2501511.2501517

Mario Boley, M. Mampaey, Bo Kang, P. Tokmakov, S. Wrobel

Abstract: It is known that productive pattern discovery from data has to interactively involve the user as directly as possible. State-of-the-art toolboxes require the specification of sophisticated workflows with an explicit selection of a data mining method, all its required parameters, and a corresponding algorithm. This hinders the desired rapid interaction, especially with users who are experts in the data domain rather than data mining experts. In this paper, we present a fundamentally new approach towards user involvement that relies exclusively on the implicit feedback available from the natural analysis behavior of the user, and at the same time allows the user to work with a multitude of pattern classes and discovery algorithms simultaneously without even knowing the details of each algorithm. To achieve this goal, we rely on a recently proposed co-active learning model and a special feature representation of patterns to arrive at an adaptively tuned user interestingness model. At the same time, we propose an adaptive time-allocation strategy to distribute computation time among a set of underlying mining algorithms. We describe the technical details of our approach, present the user interface for gathering implicit feedback, and provide preliminary evaluation results.

Citations: 67

Randomly sampling maximal itemsets

Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics Pub Date : 2013-08-11 DOI: 10.1145/2501511.2501523

Sandy Moens, Bart Goethals

Citations: 23

Methods for exploring and mining tables on Wikipedia

Proceedings of the ACM SIGKDD Workshop on Interactive Data Exploration and Analytics Pub Date : 2013-08-11 DOI: 10.1145/2501511.2501516

Chandra Bhagavatula, Thanapon Noraset, Doug Downey

Citations: 92