A Fast Deterministic Kmeans Initialization
O. Kettani, F. Ramdani
International journal of applied information systems, 13(1), pp. 6-11. Published 2017-05-05.
DOI: https://doi.org/10.5120/IJAIS2017451683
Citations: 2
Abstract
The k-means algorithm remains one of the most widely used clustering methods, despite its sensitivity to the initial settings. This paper explores a simple, computationally inexpensive, deterministic method that provides k-means with initial seeds for clustering a given data set. It is based on computing the means of k equal-sized samples taken from the data set. We test and compare this method with the related, well-known KKZ initialization algorithm for k-means, using both simulated and real data, and find it to be more efficient in many cases.
General Terms: Data Mining, Clustering.
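The equal-parts idea described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the data are ordered or partitioned, so this sketch assumes the parts are consecutive blocks of rows in the order given. The function name `equal_parts_seeds` and the toy data are hypothetical, and this is not the authors' implementation.

```python
import numpy as np


def equal_parts_seeds(X, k):
    """Return k seeds: the mean of each of k near-equal consecutive blocks of X."""
    X = np.asarray(X, dtype=float)
    if not 1 <= k <= len(X):
        raise ValueError("k must be between 1 and the number of points")
    # Assumption: parts are consecutive row blocks in the given order;
    # the abstract does not specify the ordering or partitioning rule.
    parts = np.array_split(X, k)
    # One seed per part: the coordinate-wise mean of that part.
    return np.vstack([part.mean(axis=0) for part in parts])


# Toy data (hypothetical): two well-separated pairs of points.
X = np.array([[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]])
seeds = equal_parts_seeds(X, 2)
print(seeds)
```

Because no random draw is involved, the same input always yields the same seeds. The resulting array could then be passed as the initial centers to any k-means implementation that accepts explicit starting points.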