Two-stage time-series clustering approach under reducing time cost requirement

2020 IEEE 15th International Conference on Advanced Trends in Radioelectronics, Telecommunications and Computer Engineering (TCSET)

Pub Date: 2020-02-01

DOI: 10.1109/TCSET49122.2020.235513

N. Manakova, V. Tkachenko

Citations: 2

Abstract

Clustering is an essential task in unsupervised learning. It is valuable both as a data mining tool in its own right and as an auxiliary stage in many high-demand tasks, including structure recognition, tuning of forecast parameters, and anomaly detection. For heavily data-driven applications, particularly those involving specific data types such as the time series considered here, and given the rapid growth in data volume, computational cost becomes a critical issue. The authors develop a two-step clustering approach that splits a massive dataset into two unequal parts under the control of a clusterability metric and combines instance-based and feature-based time-series clustering. An experimental study on a well-known test dataset confirmed that the proposed method is competitive when time costs must be reduced.
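The abstract does not spell out the algorithm, so the following Python sketch is only one possible reading of the idea, not the authors' method. It assumes that the smaller part is clustered on the raw series (instance-based), that the larger part is then assigned through cheap summary features (feature-based), and that a simplified Hopkins statistic stands in for the unspecified clusterability metric. All function names, the feature set, the candidate fractions, and the threshold are hypothetical.

```python
import numpy as np

def extract_features(series):
    """Cheap per-series summary features (an assumed feature set)."""
    return np.column_stack([
        series.mean(axis=1),
        series.std(axis=1),
        np.abs(np.diff(series, axis=1)).mean(axis=1),  # mean absolute change
    ])

def hopkins(points, m, rng):
    """Simplified Hopkins statistic (plain distances, no exponent).
    Values near 1 suggest cluster structure, values near 0.5 suggest none."""
    n, d = points.shape
    probe_idx = rng.choice(n, size=m, replace=False)
    uniform = rng.uniform(points.min(axis=0), points.max(axis=0), size=(m, d))
    # Nearest-neighbour distance from sampled data points to the other points.
    dist_w = np.linalg.norm(points[probe_idx][:, None, :] - points[None, :, :], axis=2)
    dist_w[np.arange(m), probe_idx] = np.inf  # ignore each point's distance to itself
    w = dist_w.min(axis=1)
    # Nearest-neighbour distance from uniform random points to the data.
    u = np.linalg.norm(uniform[:, None, :] - points[None, :, :], axis=2).min(axis=1)
    return u.sum() / (u.sum() + w.sum())

def kmeans(points, k, n_iter=50):
    """Lloyd's k-means with a deterministic farthest-point initialisation."""
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return labels, centers

def two_stage_cluster(series, k, fractions=(0.1, 0.2, 0.4), threshold=0.7, seed=0):
    """Hypothetical two-stage scheme; fractions and threshold are assumptions."""
    rng = np.random.default_rng(seed)
    n = len(series)
    order = rng.permutation(n)
    feats = extract_features(series)
    # Use the smallest candidate part whose features look clusterable;
    # if none passes the threshold, the largest candidate is used.
    for frac in fractions:
        n_small = max(2 * k, int(frac * n))
        small_idx = order[:n_small]
        if hopkins(feats[small_idx], m=max(2, n_small // 4), rng=rng) >= threshold:
            break
    large_idx = order[n_small:]
    # Stage 1 (instance-based): k-means on the raw series of the small part.
    small_labels, _ = kmeans(series[small_idx], k)
    # Stage 2 (feature-based): assign the large part to the nearest feature
    # centroid. Assumes no stage-1 cluster ended up empty.
    centroids = np.vstack([feats[small_idx][small_labels == j].mean(axis=0)
                           for j in range(k)])
    dists = np.linalg.norm(feats[large_idx][:, None, :] - centroids[None, :, :], axis=2)
    labels = np.empty(n, dtype=int)
    labels[small_idx] = small_labels
    labels[large_idx] = dists.argmin(axis=1)
    return labels

def make_demo(n_per_group=100, length=64, seed=1):
    """Synthetic data: slow and fast noisy sine waves (illustration only)."""
    rng = np.random.default_rng(seed)
    t = np.linspace(0.0, 1.0, length)
    slow = np.sin(2 * np.pi * 1 * t) + 0.05 * rng.standard_normal((n_per_group, length))
    fast = np.sin(2 * np.pi * 5 * t) + 0.05 * rng.standard_normal((n_per_group, length))
    return np.vstack([slow, fast]), np.repeat([0, 1], n_per_group)

if __name__ == "__main__":
    series, truth = make_demo()
    labels = two_stage_cluster(series, k=2)
    agreement = (labels == truth).mean()
    print("agreement with ground truth:", max(agreement, 1 - agreement))
```

In this sketch the time saving would come from running the more expensive raw-series clustering only on the small part, while the bulk of the data needs just one feature extraction pass and one nearest-centroid lookup. Whether the paper divides the work this way is not stated in the abstract.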



