StreamKDD '10: Latest Papers

Fully decentralized computation of aggregates over data streams

StreamKDD '10 Pub Date : 2011-03-31 DOI: 10.1145/1833280.1833281

L. Becchetti, Ilaria Bordino, S. Leonardi, A. Rosén

Abstract: In several emerging applications, data is collected in massive streams at several distributed points of observation. A basic and challenging task is to allow every node to monitor a neighbourhood of interest by issuing continuous aggregate queries on the streams observed in its vicinity. This class of algorithms is fully decentralized and diffusive in nature: collecting all data at a few central nodes of the network is infeasible in networks of low-capability devices or in the presence of massive data sets.

The main difficulty in designing diffusive algorithms is coping with duplicate detections. These arise both from the observation of the same event at several nodes of the network and from receipt of the same aggregated information along multiple paths of diffusion.

In this paper, we consider fully decentralized algorithms that locally answer continuous aggregate queries on the number of distinct events, the total number of events and the second frequency moment in the scenario outlined above. The proposed algorithms use sublinear space at every node, either in the worst case or on realistic distributions.

We also propose strategies that minimize the communication needed to update the aggregates when new events are observed. We finally present an experimental analysis providing evidence for the efficiency and accuracy of our algorithms on realistic simulated scenarios.
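The duplicate problem described in this abstract can be illustrated with a small sketch. The code below uses a k-minimum-values (KMV) summary, chosen here only for illustration and not necessarily the technique used in the paper. The names `make_sketch`, `merge`, `estimate` and the size `K` are hypothetical. The point it shows is that merging is idempotent, so the same event seen at two nodes, or the same summary received along two paths, does not inflate the distinct count.

```python
import hashlib

K = 64  # sketch size (an assumed value; larger K lowers the estimation error)


def _hash01(item: str) -> float:
    """Map an item to a pseudo-uniform value in [0, 1)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def make_sketch(items, k=K):
    """Summarize a stream by the k smallest distinct hash values."""
    return sorted({_hash01(x) for x in items})[:k]


def merge(a, b, k=K):
    """Combine two sketches; set union makes repeated input harmless."""
    return sorted(set(a) | set(b))[:k]


def estimate(sketch, k=K):
    """Estimate the number of distinct items behind a sketch."""
    if len(sketch) < k:
        return float(len(sketch))  # fewer than k distinct hashes: count is exact
    return (k - 1) / sketch[k - 1]  # standard KMV estimator


# Two nodes observe overlapping events; e2 and e3 are seen twice.
node_a = make_sketch(["e1", "e2", "e3"])
node_b = make_sketch(["e2", "e3", "e4"])
print(estimate(merge(node_a, node_b)))  # 4.0
```

Because `merge` depends only on the set of hash values, a node can forward its summary to every neighbour without tracking which paths the information has already travelled.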

Citations: 2

Evolutionary clustering using frequent itemsets

StreamKDD '10 Pub Date : 2010-07-25 DOI: 10.1145/1833280.1833284

R. Shankar, G. V. Kiran, Vikram Pudi

Citations: 15

CALDS: context-aware learning from data streams

StreamKDD '10 Pub Date : 2010-07-25 DOI: 10.1145/1833280.1833283

J. Gomes, Ernestina Menasalvas Ruiz, Pedro A. C. Sousa

Abstract: Drift detection methods in data streams can detect changes in incoming data so that learned models can be used to represent the underlying population. In many real-world scenarios context information is available and could be exploited to improve existing approaches, by detecting or even anticipating recurring concepts in the underlying population. Several applications, among them health-care or recommender systems, lend themselves to using such information, since data from sensors is available but is not being used. Nevertheless, new challenges arise when integrating context with drift detection methods. Modeling and comparing context information, representing the context-concepts history and storing previously learned concepts for reuse are some of the critical problems. In this work, we propose the Context-aware Learning from Data Streams (CALDS) system to improve existing drift detection methods by exploiting available context information. Our enhancement is seamless: we use the association between context information and learned concepts to improve detection and adaptation to drift when concepts reappear. We present and discuss our preliminary experimental results with synthetic and real datasets.

Citations: 17

Towards subspace clustering on dynamic data: an incremental version of PreDeCon

StreamKDD '10 Pub Date : 2010-07-25 DOI: 10.1145/1833280.1833285

H. Kriegel, Peer Kröger, Eirini Ntoutsi, A. Zimek

Citations: 11

Visual analysis of news streams with article threads

StreamKDD '10 Pub Date : 2010-07-25 DOI: 10.1145/1833280.1833286

Milos Krstajic, E. Bertini, Florian Mansmann, D. Keim

Abstract: The analysis of large quantities of news is an emerging area in the field of data analysis and visualization. International agencies collect thousands of news items every day from a large number of sources, and making sense of them is becoming increasingly complex due to the rate of the incoming news, as well as the inherent complexity of analyzing large quantities of evolving text corpora. Current visual techniques that deal with the temporal evolution of such complex datasets, together with research efforts in related domains like text mining and topic detection and tracking, represent early attempts to understand, gain insight into and make sense of these data. Despite these initial propositions, there is still a lack of techniques dealing directly with the problem of visualizing news streams in an "on-line" fashion, that is, in a way that the evolution of news can be monitored in real time by the operator. In this paper we propose a purely visual technique that makes it possible to see the evolution of news in real time. The technique shows the stream of news items as they enter the system, as well as a series of important threads which are computed on the fly. By merging single articles into threads, the technique reduces the load on the visualization and retains only the most relevant information. The proposed technique is applied to the visualization of news streams generated by a news aggregation system that monitors over 4000 sites from 1600 key news portals world-wide and retrieves over 80000 reports per day in 43 languages.

Citations: 23

Detecting outliers on arbitrary data streams using anytime approaches

StreamKDD '10 Pub Date : 2010-07-25 DOI: 10.1145/1833280.1833282

I. Assent, P. Kranen, C. Baldauf, T. Seidl

Citations: 9

Research issues in mining multiple data streams

StreamKDD '10 Pub Date : 2010-07-25 DOI: 10.1145/1833280.1833288

Wenyan Wu, L. Gruenwald

Citations: 27

Conformal prediction for distribution-independent anomaly detection in streaming vessel data

StreamKDD '10 Pub Date : 2010-07-25 DOI: 10.1145/1833280.1833287

Rikard Laxhammar, G. Falkman

Abstract: This paper presents a novel application of the theory of conformal prediction for distribution-independent on-line learning and anomaly detection. We exploit the fact that conformal predictors give valid prediction sets at specified confidence levels under the relatively weak assumption that the (normal) training data together with (normal) observations to be predicted have been generated from the same distribution. If the actual observation is not included in the possibly empty prediction set, it is classified as anomalous at the corresponding significance level. Interpreting the significance level as an upper bound of the probability that a normal observation is mistakenly classified as anomalous, we can conveniently adjust the sensitivity to anomalies while controlling the rate of false alarms without having to find any application specific thresholds. The proposed method has been evaluated in the domain of sea surveillance using recorded data assumed to be normal. The validity of the prediction sets is justified by the empirical error rate which is just below the significance level. In addition, experiments with simulated anomalous data indicate that anomaly detection sensitivity is superior to that of two previously proposed methods.
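The decision rule described in this abstract can be sketched in a few lines. The example below is an assumption-laden illustration, not the paper's implementation: it uses a k-nearest-neighbour distance sum as the nonconformity measure (the abstract does not name one), and the function names `knn_score`, `conformal_p_value` and `is_anomalous` are hypothetical. An observation is flagged as anomalous when its conformal p-value falls below the significance level epsilon.

```python
import math


def knn_score(point, others, k):
    """Nonconformity (assumed measure): sum of distances to the k nearest other points."""
    dists = sorted(math.dist(point, o) for o in others)
    return sum(dists[:k])


def conformal_p_value(train, new, k=2):
    """Fraction of examples in the bag that are at least as nonconforming as `new`."""
    bag = train + [new]
    # Score every example against the rest of the bag, including the new one.
    scores = [knn_score(p, bag[:i] + bag[i + 1:], k) for i, p in enumerate(bag)]
    alpha_new = scores[-1]
    return sum(s >= alpha_new for s in scores) / len(bag)


def is_anomalous(train, new, epsilon=0.1, k=2):
    """Flag `new` as anomalous at significance level epsilon."""
    return conformal_p_value(train, new, k) < epsilon


# Normal data: a 5 x 5 grid of points (made-up illustrative data).
train = [(float(x), float(y)) for x in range(5) for y in range(5)]
print(is_anomalous(train, (50.0, 50.0)))  # True: far from every training point
print(is_anomalous(train, (2.5, 2.5)))    # False: sits inside the grid
```

Under the exchangeability assumption stated in the abstract, epsilon bounds the probability of flagging a normal observation, which is why no application-specific distance threshold is needed.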

Citations: 44