A Fast Deterministic Kmeans Initialization

International Journal of Applied Information Systems. Pub Date: 2017-05-05. DOI: 10.5120/IJAIS2017451683

O. Kettani, F. Ramdani

Citations: 2

Abstract

The k-means algorithm remains one of the most widely used clustering methods, despite its sensitivity to initialization. This paper explores a simple, computationally inexpensive, deterministic method that provides k-means with initial seeds for clustering a given data set. The method computes the means of k samples taken as equal parts of the data set. We test and compare it with the related, well-known KKZ initialization algorithm for k-means, using both simulated and real data, and find it to be more efficient in many cases.

General Terms: Data Mining, Clustering.
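The abstract does not say how the k equal parts are drawn, so the following is only a minimal sketch of the idea, not the authors' exact procedure. It assumes the parts are contiguous slices of the data in its stored order; the function name `equal_parts_seeds` and the toy points are hypothetical and used for illustration only.

```python
from typing import List, Sequence


def equal_parts_seeds(data: Sequence[Sequence[float]], k: int) -> List[List[float]]:
    """Return k seeds, each the mean of one near-equal part of `data`.

    Assumption (not stated in the abstract): the parts are contiguous
    slices of the data in the order it is stored.
    """
    n = len(data)
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= len(data)")
    dim = len(data[0])
    seeds = []
    for i in range(k):
        # Integer arithmetic keeps part sizes within one point of each other
        # and guarantees every part is non-empty when k <= n.
        start = i * n // k
        stop = (i + 1) * n // k
        part = data[start:stop]
        # Coordinate-wise mean of the points in this part.
        seeds.append([sum(p[d] for p in part) / len(part) for d in range(dim)])
    return seeds


# Hypothetical toy data: two well-separated pairs of 2-D points.
points = [(0.0, 0.0), (2.0, 0.0), (10.0, 10.0), (12.0, 10.0)]
seeds = equal_parts_seeds(points, 2)
print(seeds)  # [[1.0, 0.0], [11.0, 10.0]]
```

Because no random draws are involved, repeated calls on the same input return the same seeds, which is the deterministic property the abstract emphasizes. The cost is a single pass over the data. Under the contiguous-slice assumption the result depends on the order of the data.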


