An Efficient K-Means Clustering Initialization Using Optimization Algorithm

2019 International Conference on Advances in Computing and Communication Engineering (ICACCE). Pub Date: 2019-04-01. DOI: 10.1109/ICACCE46606.2019.9079998

V. Divya, R. Deepika, C. Yamini, P. Sobiyaa


Citations: 3

Abstract



Data mining offers many techniques for knowledge discovery. Among them, clustering is a well-established technique for unsupervised learning. Its main objective is to find high-quality clusters in which the distance between clusters is maximal and the distance within each cluster is minimal. The K-means algorithm is applied in this paper for its simplicity. It has been widely discussed and applied in pattern recognition and machine learning. However, the K-means algorithm cannot guarantee a unique clustering result for the same dataset because its initial cluster centers are selected randomly. To avoid this issue, a new initialization method is proposed: an improved K-means algorithm whose initial centers are chosen with the Cuckoo Search algorithm. The proposed method is evaluated on several numerical datasets: iris, wine, and solar datasets (Ames and Chariton stations). The K-means clustering solutions are compared with those obtained using cuckoo search initialization on measures such as accuracy, precision, recall, F1-score, silhouette value, and MSE (mean squared error). The experimental results demonstrate the effectiveness of the proposed method.
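The abstract does not give the details of the initialization, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a simplified cuckoo search (Lévy-flight steps via Mantegna's algorithm, abandonment of the worst nests) that looks for K starting centers with a low sum of squared errors and then hands them to plain K-means. The function names, the parameter values (`n_nests`, `n_iter`, `pa`, `alpha`, `beta`), and the synthetic two-dimensional data are all illustrative assumptions. The paper's iris, wine, and solar datasets are not used here.

```python
import math
import random

def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(math.dist(p, c) ** 2 for c in centers) for p in points)

def kmeans(points, centers, n_iter=50):
    """Plain Lloyd iterations starting from the given centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(n_iter):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # An empty cluster keeps its previous center.
        new = [tuple(sum(x) / len(g) for x in zip(*g)) if g else c
               for g, c in zip(groups, centers)]
        if new == centers:
            break
        centers = new
    return centers

def levy_step(rng, beta=1.5):
    """One heavy-tailed step length via Mantegna's algorithm."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
             ) ** (1 / beta)
    u, v = rng.gauss(0, sigma), rng.gauss(0, 1)
    return u / max(abs(v), 1e-12) ** (1 / beta)

def cuckoo_init(points, k, rng, n_nests=10, n_iter=30, pa=0.25, alpha=0.1):
    """Search for k initial centers with low SSE (simplified cuckoo search)."""
    # Each nest is one candidate set of k centers, seeded from the data.
    nests = [rng.sample(points, k) for _ in range(n_nests)]
    cost = [sse(points, n) for n in nests]
    for _ in range(n_iter):
        for i in range(n_nests):
            # A Lévy-flight perturbation of nest i produces a new candidate.
            cand = [tuple(x + alpha * levy_step(rng) for x in c)
                    for c in nests[i]]
            cand_cost = sse(points, cand)
            # The candidate replaces a randomly chosen nest only if better.
            j = rng.randrange(n_nests)
            if cand_cost < cost[j]:
                nests[j], cost[j] = cand, cand_cost
        # Abandon the worst fraction pa of nests and rebuild them at random.
        worst = sorted(range(n_nests), key=cost.__getitem__, reverse=True)
        for i in worst[:int(pa * n_nests)]:
            nests[i] = rng.sample(points, k)
            cost[i] = sse(points, nests[i])
    best = min(range(n_nests), key=cost.__getitem__)
    return nests[best]

def make_blobs(rng, centers=((0, 0), (5, 5), (0, 5)), n=30, spread=0.6):
    """Synthetic 2-D toy data: n Gaussian points around each center."""
    return [(rng.gauss(cx, spread), rng.gauss(cy, spread))
            for cx, cy in centers for _ in range(n)]

points = make_blobs(random.Random(0))
init = cuckoo_init(points, 3, random.Random(1))   # searched starting centers
final = kmeans(points, init)                      # refined by K-means
print("SSE at initialization:", sse(points, init))
print("SSE after K-means:    ", sse(points, final))
print("MSE after K-means:    ", sse(points, final) / len(points))
```

Because the search is driven by a seeded random generator, repeated runs with the same seed return the same starting centers, which addresses the non-uniqueness the abstract attributes to random initialization. A real comparison would also need the labeled datasets and the other measures the abstract lists.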

