StreamKDD '10: Latest Papers

Fully decentralized computation of aggregates over data streams
StreamKDD '10 Pub Date: 2011-03-31 DOI: 10.1145/1833280.1833281
L. Becchetti, Ilaria Bordino, S. Leonardi, A. Rosén
Abstract: In several emerging applications, data is collected in massive streams at several distributed points of observation. A basic and challenging task is to allow every node to monitor a neighbourhood of interest by issuing continuous aggregate queries on the streams observed in its vicinity. This class of algorithms is fully decentralized and diffusive in nature: collecting all data at a few central nodes of the network is infeasible in networks of low-capability devices or in the presence of massive data sets. The main difficulty in designing diffusive algorithms is coping with duplicate detections. These arise both from the observation of the same event at several nodes of the network and from receipt of the same aggregated information along multiple paths of diffusion. In this paper, we consider fully decentralized algorithms that locally answer continuous aggregate queries on the number of distinct events, the total number of events, and the second frequency moment in the scenario outlined above. The proposed algorithms use, in the worst case or on realistic distributions, sublinear space at every node. We also propose strategies that minimize the communication needed to update the aggregates when new events are observed. We finally present experimental analysis providing evidence for the efficiency and accuracy of our algorithms on realistic simulated scenarios.
Citations: 2
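The duplicate problem described in this abstract can be illustrated with a mergeable, duplicate-insensitive sketch. The block below is a minimal illustration using a k-minimum-values (KMV) distinct-count sketch; it is not the algorithm proposed in the paper, and the class name `KMVSketch`, the helper `_hash01`, and the default `k = 256` are assumptions made for this example. The point it shows is that merging by set union is idempotent, so an event seen at several nodes, or a sketch received along several diffusion paths, is counted once.

```python
import hashlib
import heapq


def _hash01(item: str) -> float:
    """Map an item to a pseudo-random value in [0, 1) (hypothetical helper)."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


class KMVSketch:
    """Keeps the k smallest hash values seen so far.

    Illustrative only: a standard KMV sketch, not the paper's method.
    """

    def __init__(self, k: int = 256):
        self.k = k
        self.values = set()

    def add(self, item: str) -> None:
        # Adding the same item twice stores the same hash value once.
        self.values.add(_hash01(item))
        self._trim()

    def merge(self, other: "KMVSketch") -> None:
        # Set union is idempotent, so re-merging a sketch changes nothing.
        self.values |= other.values
        self._trim()

    def _trim(self) -> None:
        if len(self.values) > self.k:
            self.values = set(heapq.nsmallest(self.k, self.values))

    def estimate(self) -> float:
        # Fewer than k values stored: the count is exact (up to collisions).
        if len(self.values) < self.k:
            return float(len(self.values))
        # Otherwise estimate from the k-th smallest hash value.
        return (self.k - 1) / max(self.values)
```

In this sketch each node would keep one `KMVSketch`, call `add` for locally observed events, and call `merge` on sketches received from neighbours. Memory per node is bounded by `k` values regardless of stream length.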
Evolutionary clustering using frequent itemsets
StreamKDD '10 Pub Date: 2010-07-25 DOI: 10.1145/1833280.1833284
R. Shankar, G. V. Kiran, Vikram Pudi
Abstract: Evolutionary clustering is an emerging research area addressing the problem of clustering dynamic data. An evolutionary clustering should satisfy two conflicting criteria: preserving the current cluster quality and not deviating too much from the recent history. In this paper we propose an algorithm for evolutionary clustering using frequent itemsets. A frequent-itemset-based approach to evolutionary clustering is natural, and it automatically satisfies the two criteria of evolutionary clustering. We provide theoretical as well as experimental support for our claims. We performed experiments on our approach using different datasets, and the results show that our approach is comparable to most of the existing algorithms for evolutionary clustering.
Citations: 15
CALDS: context-aware learning from data streams
StreamKDD '10 Pub Date: 2010-07-25 DOI: 10.1145/1833280.1833283
J. Gomes, Ernestina Menasalvas Ruiz, Pedro A. C. Sousa
Abstract: Drift detection methods in data streams can detect changes in incoming data so that learned models can be used to represent the underlying population. In many real-world scenarios context information is available and could be exploited to improve existing approaches, by detecting or even anticipating recurring concepts in the underlying population. Several applications, among them health-care and recommender systems, lend themselves to using such information, as data from sensors is available but is not being used. Nevertheless, new challenges arise when integrating context with drift detection methods. Modeling and comparing context information, representing the context-concepts history, and storing previously learned concepts for reuse are some of the critical problems. In this work, we propose the Context-aware Learning from Data Streams (CALDS) system to improve existing drift detection methods by exploiting available context information. Our enhancement is seamless: we use the association between context information and learned concepts to improve detection of and adaptation to drift when concepts reappear. We present and discuss our preliminary experimental results with synthetic and real datasets.
Citations: 17
Towards subspace clustering on dynamic data: an incremental version of PreDeCon
StreamKDD '10 Pub Date: 2010-07-25 DOI: 10.1145/1833280.1833285
H. Kriegel, Peer Kröger, Eirini Ntoutsi, A. Zimek
Abstract: Today's data are high dimensional and dynamic; thus clustering over such data is rather complicated. To deal with the high-dimensionality problem, the subspace clustering research area has lately emerged, which aims at finding clusters in subspaces of the original feature space. So far, subspace clustering methods are mainly static and thus cannot address the dynamic nature of modern data. In this paper, we propose an incremental version of the density-based projected clustering algorithm PreDeCon, called incPreDeCon. The proposed algorithm efficiently updates only those subspace clusters that might be affected by the population update.
Citations: 11
Visual analysis of news streams with article threads
StreamKDD '10 Pub Date: 2010-07-25 DOI: 10.1145/1833280.1833286
Milos Krstajic, E. Bertini, Florian Mansmann, D. Keim
Abstract: The analysis of large quantities of news is an emerging area in the field of data analysis and visualization. International agencies collect thousands of news items every day from a large number of sources, and making sense of them is becoming increasingly complex due to the rate of the incoming news, as well as the inherent complexity of analyzing large quantities of evolving text corpora. Current visual techniques that deal with the temporal evolution of such complex datasets, together with research efforts in related domains like text mining and topic detection and tracking, represent early attempts to understand, gain insight into, and make sense of these data. Despite these initial propositions, there is still a lack of techniques dealing directly with the problem of visualizing news streams in an "on-line" fashion, that is, in a way that the evolution of news can be monitored in real time by the operator. In this paper we propose a purely visual technique that makes it possible to see the evolution of news in real time. The technique shows the stream of news as it enters the system as well as a series of important threads which are computed on the fly. By merging single articles into threads, the technique offloads the visualization and retains only the most relevant information. The proposed technique is applied to the visualization of news streams generated by a news aggregation system that monitors over 4000 sites from 1600 key news portals world-wide and retrieves over 80000 reports per day in 43 languages.
Citations: 23
Detecting outliers on arbitrary data streams using anytime approaches
StreamKDD '10 Pub Date: 2010-07-25 DOI: 10.1145/1833280.1833282
I. Assent, P. Kranen, C. Baldauf, T. Seidl
Abstract: Data streams are gaining importance in many sensing and monitoring environments. Frequent mining tasks on data streams include classification, modeling and outlier detection. Since data arrival rates often vary, anytime algorithms have been proposed for stream clustering and classification, which can deliver a fast first result and improve their result if more time is available. In this work, we propose the novel concept of anytime outlier detection and introduce an algorithm for anytime outlier detection based on a hierarchical cluster representation. We show promising results in preliminary experiments and discuss future research for anytime outlier detection.
Citations: 9
Research issues in mining multiple data streams
StreamKDD '10 Pub Date: 2010-07-25 DOI: 10.1145/1833280.1833288
Wenyan Wu, L. Gruenwald
Abstract: There exist emerging applications of data streams that have mining requirements. Although single data stream mining has been extensively studied, little research has been done on mining multiple data streams (MDS), which are more complex than single data streams and are involved in many real-world applications. This paper discusses the characteristics of MDS, proposes a formal definition for them, analyzes MDS applications in terms of mining requirements, and identifies research issues for MDS mining.
Citations: 27
Conformal prediction for distribution-independent anomaly detection in streaming vessel data
StreamKDD '10 Pub Date: 2010-07-25 DOI: 10.1145/1833280.1833287
Rikard Laxhammar, G. Falkman
Abstract: This paper presents a novel application of the theory of conformal prediction for distribution-independent on-line learning and anomaly detection. We exploit the fact that conformal predictors give valid prediction sets at specified confidence levels under the relatively weak assumption that the (normal) training data together with the (normal) observations to be predicted have been generated from the same distribution. If the actual observation is not included in the possibly empty prediction set, it is classified as anomalous at the corresponding significance level. Interpreting the significance level as an upper bound on the probability that a normal observation is mistakenly classified as anomalous, we can conveniently adjust the sensitivity to anomalies while controlling the rate of false alarms without having to find any application-specific thresholds. The proposed method has been evaluated in the domain of sea surveillance using recorded data assumed to be normal. The validity of the prediction sets is justified by the empirical error rate, which is just below the significance level. In addition, experiments with simulated anomalous data indicate that anomaly detection sensitivity is superior to that of two previously proposed methods.
Citations: 44
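The decision rule described in this abstract can be sketched as a conformal p-value test. The block below is a minimal illustration under stated assumptions, not the authors' implementation: the nonconformity measure (Euclidean distance to the nearest other point) and the function names `nn_distance`, `conformal_p_value`, and `is_anomalous` are chosen for this example only. An observation is flagged as anomalous when its p-value falls below the significance level `epsilon`.

```python
import math


def nn_distance(point, others):
    """Nonconformity score (assumed here): distance to the nearest other point."""
    return min(math.dist(point, o) for o in others)


def conformal_p_value(train, new_point):
    """Fraction of points in the bag that are at least as nonconforming
    as the new observation, with the new observation included in the bag."""
    bag = list(train) + [new_point]
    scores = [
        nn_distance(p, bag[:i] + bag[i + 1:])
        for i, p in enumerate(bag)
    ]
    new_score = scores[-1]
    return sum(s >= new_score for s in scores) / len(bag)


def is_anomalous(train, new_point, epsilon=0.05):
    """Flag the observation when its p-value is below the significance level."""
    return conformal_p_value(train, new_point) < epsilon
```

With n training points, the smallest possible p-value is 1/(n+1), so `epsilon` must exceed that value for any observation to be flagged. For example, with 25 training points on a 5 by 5 grid, a distant point such as `(100, 100)` gets p-value 1/26, which is below the default `epsilon` of 0.05.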