A Novel k'-Means Algorithm for Clustering Analysis
Chonglun Fang, Jinwen Ma
2009 2nd International Conference on Biomedical Engineering and Informatics, pp. 1-5
Published: 2009-10-30
DOI: 10.1109/BMEI.2009.5304816 (https://doi.org/10.1109/BMEI.2009.5304816)
Citations: 9
Abstract
This paper proposes a novel k'-means algorithm for clustering analysis in cases where the true number of clusters in a data set is not known in advance. Specifically, when the number of seed-points in the algorithm is set larger than the true number k of clusters in the data set, the proposed algorithm assigns k seed-points to the actual clusters, one per cluster, while the extra seed-points correspond to empty clusters, i.e., clusters that win no points under a newly defined distance. Using the Mahalanobis distance, the algorithm can be further extended to elliptical clustering analysis. Experiments on a simulated data set and the wine data demonstrate that the proposed k'-means algorithm finds the correct number of clusters in the sample data with a good correct classification rate. Moreover, the algorithm is successfully applied to unsupervised color image segmentation.
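The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify the newly defined distance, so the winner rule below falls back to plain squared Euclidean distance as an assumption. The function names `mahalanobis_sq` and `winning_counts` and the toy data are hypothetical. The sketch only shows how a squared Mahalanobis distance is computed and how a seed-point with no winning points (an empty cluster) can be detected.

```python
import numpy as np

def mahalanobis_sq(X, center, cov):
    """Squared Mahalanobis distance from each row of X to center."""
    diff = X - center
    # Solve cov @ z = diff^T rather than inverting cov explicitly.
    solved = np.linalg.solve(cov, diff.T).T
    return np.einsum("ij,ij->i", diff, solved)

def winning_counts(X, seeds):
    """Count the points each seed wins.

    Assumption: nearest seed under squared Euclidean distance,
    standing in for the paper's own distance, which is not given here.
    """
    d2 = ((X[:, None, :] - seeds[None, :, :]) ** 2).sum(axis=2)
    winners = d2.argmin(axis=1)
    return np.bincount(winners, minlength=len(seeds))

# Toy data: two tight clusters but three seed-points (one extra).
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
seeds = np.array([[0.0, 0.0], [5.0, 5.0], [100.0, 100.0]])

counts = winning_counts(X, seeds)
empty = np.flatnonzero(counts == 0)  # indices of seeds with no winning points
```

In this toy setup the third seed is placed far from the data by hand, so it wins no points. In the paper, driving the extra seed-points to that state is the job of the proposed distance and update rule, which this sketch does not reproduce.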