An Empirical Comparison of Stream Clustering Algorithms

Proceedings of the Computing Frontiers Conference
Pub Date: 2017-05-15
DOI: 10.1145/3075564.3078887

Matthias Carnein, Dennis Assenmacher, H. Trautmann

Citations: 35

Abstract

Analysing streaming data has received considerable attention in recent years. A key research area in this field is stream clustering, which aims to recognize patterns in a possibly unbounded data stream of varying speed and structure. Over the past decades, a multitude of new stream clustering algorithms have been proposed. However, to the best of our knowledge, no rigorous analysis and comparison of the different approaches has been performed. Our paper fills this gap and provides extensive experiments for a total of ten popular algorithms. We use a number of standard data sets, both real and synthetic, and identify key weaknesses and strengths of the existing algorithms.



