An Improved K-means Algorithm Based on Multiple Clustering and Density
Yulong Ling, Xiao Zhang
2021 13th International Conference on Machine Learning and Computing (published 2021-02-26)
DOI: https://doi.org/10.1145/3457682.3457695
The k-means algorithm selects its initial set of cluster centers at random, which leads to unstable clustering results. To address this shortcoming, many improved density-based k-means algorithms have been proposed, but their time complexity is too high. To improve clustering stability and reduce clustering time, this paper proposes an improved algorithm based on multiple clustering and density. The algorithm first runs k-means several times and adaptively selects a set of high-quality samples according to the distance between each sample and its corresponding cluster center. The initial cluster centers are then selected according to the principles of furthest distance and high density. Experiments on UCI data sets show that, compared with the k-means and k-means++ algorithms, the proposed algorithm not only improves clustering performance but also ensures the stability of the clustering results. Compared with improved density-based k-means algorithms, it greatly reduces clustering time.
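The abstract does not say how the retained samples are chosen, how density is measured, or how distance and density are combined, so the following is only a rough sketch of the two-stage idea under stated assumptions, not the authors' implementation. It assumes that a sample is retained when, in at least half of the preliminary runs, its distance to its cluster center is no larger than the mean distance within that cluster; that density is the number of retained samples within a fixed radius; and that each new initial center maximizes density multiplied by the distance to the nearest center already chosen. All function names and the parameters `runs`, `radius`, and `seed` are hypothetical.

```python
import math
import random


def dist(a, b):
    """Euclidean distance between two points given as equal-length tuples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def assign(points, centers):
    """Index of the nearest center for every point."""
    return [min(range(len(centers)), key=lambda j: dist(p, centers[j]))
            for p in points]


def kmeans(points, centers, iters=50):
    """Standard Lloyd iterations starting from the given initial centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        labels = assign(points, centers)
        updated = []
        for j, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == j]
            # An empty cluster keeps its previous center.
            updated.append(tuple(sum(col) / len(members) for col in zip(*members))
                           if members else old)
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)


def select_core_samples(points, k, runs, rng):
    """Indices of samples close to their center in at least half of the runs.

    Assumed criterion: "close" means the distance to the assigned center is
    at most the mean distance within that cluster.
    """
    votes = [0] * len(points)
    for _ in range(runs):
        centers, labels = kmeans(points, rng.sample(points, k))
        d = [dist(p, centers[lab]) for p, lab in zip(points, labels)]
        for j in range(k):
            idx = [i for i, lab in enumerate(labels) if lab == j]
            if idx:
                mean_d = sum(d[i] for i in idx) / len(idx)
                for i in idx:
                    if d[i] <= mean_d:
                        votes[i] += 1
    return [i for i, v in enumerate(votes) if 2 * v >= runs]


def init_centers(points, core, k, radius):
    """Pick k initial centers from the retained samples.

    Assumed rule: start with the densest sample, then repeatedly add the
    sample maximizing density * distance to the nearest chosen center.
    """
    cand = [points[i] for i in core]
    density = [sum(1 for q in cand if dist(p, q) <= radius) for p in cand]
    chosen = [max(range(len(cand)), key=lambda i: density[i])]
    while len(chosen) < k:
        def score(i):
            return density[i] * min(dist(cand[i], cand[c]) for c in chosen)
        chosen.append(max(range(len(cand)), key=score))
    return [cand[i] for i in chosen]


def improved_kmeans(points, k, runs=5, radius=1.0, seed=0):
    """Two-stage sketch: preliminary runs, then a seeded final k-means run."""
    rng = random.Random(seed)
    core = select_core_samples(points, k, runs, rng)
    if len(core) < k:  # fall back to all samples if too few were retained
        core = list(range(len(points)))
    return kmeans(points, init_centers(points, core, k, radius))
```

Calling `improved_kmeans(points, k)` with `points` as a list of coordinate tuples returns the final centers and one cluster label per point. The preliminary runs start from random samples, but the final run starts from the deterministically selected centers, which is the source of the stability the abstract describes.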