A K-means clustering with optimized initial center based on Hadoop platform

Kunhui Lin, Xiang Li, Zhongnan Zhang, Jiahong Chen
2014 9th International Conference on Computer Science & Education, published 2014-10-16. DOI: 10.1109/ICCSE.2014.6926466 (https://doi.org/10.1109/ICCSE.2014.6926466)
Citations: 17

Abstract

With the explosive growth of data, traditional clustering algorithms running on standalone servers can no longer meet demand. To address this, a growing number of researchers have implemented traditional clustering algorithms, K-means in particular, on cloud computing platforms. However, few researchers have examined the structure of the K-means algorithm itself; most have instead optimized the model of the cloud computing platform to speed up K-means clustering. The instability caused by random initial centers therefore still exists. In this paper, we propose a K-means clustering algorithm whose initial centers are optimized based on data dimensional density. The method avoids the shortcomings of random initial centers and improves the stability of K-means clustering. Experimental results show that the approach performs well and improves the accuracy of K-means clustering on the test set.
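The abstract does not say how "data dimensional density" is computed, so the following is only a minimal sketch of the general idea, not the authors' method. It swaps in a simple neighbour-count density: each point is scored by how many points lie within a hypothetical `radius`, and the `k` densest points that are more than `radius` apart become the initial centers. The function names `density_init` and `kmeans`, the `radius` parameter, and the toy data are all assumptions made for illustration. The assignment and averaging steps are commented as "map" and "reduce" because that is the usual way K-means is split on MapReduce-style platforms such as Hadoop; the sketch itself runs on a single machine.

```python
def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def density_init(points, k, radius):
    """Pick k initial centers deterministically from dense regions.

    Assumed stand-in for the paper's density measure: the density of a
    point is the number of points within `radius` of it (itself included).
    """
    r2 = radius ** 2
    density = [sum(1 for q in points if dist2(p, q) <= r2) for p in points]
    # Densest points first; ties are broken by original index.
    order = sorted(range(len(points)), key=lambda i: (-density[i], i))
    centers = []
    for i in order:
        # Skip candidates that sit inside the radius of a chosen center.
        if all(dist2(points[i], c) > r2 for c in centers):
            centers.append(points[i])
            if len(centers) == k:
                return centers
    raise ValueError("could not find k dense points more than radius apart")


def kmeans(points, centers, max_iter=100):
    """Plain Lloyd iterations starting from the given centers."""
    centers = list(centers)
    for _ in range(max_iter):
        # "Map" step: assign every point to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda j: dist2(p, centers[j]))
            groups[nearest].append(p)
        # "Reduce" step: each new center is the mean of its group.
        # An empty group keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
        converged = new_centers == centers
        centers = new_centers
        if converged:
            break
    return centers


# Toy data: three 3x3 grids of points around (0, 0), (10, 0) and (0, 10).
points = [(cx + dx, cy + dy)
          for cx, cy in [(0, 0), (10, 0), (0, 10)]
          for dx in (-1, 0, 1)
          for dy in (-1, 0, 1)]

seeds = density_init(points, k=3, radius=1.5)
final = kmeans(points, seeds)
print(seeds)
print(final)
```

Because the seeds depend only on the data and not on a random draw, repeated runs on the same input give the same clustering, which is the kind of stability the abstract is after. The neighbour count used here costs O(n²) distance computations, so on a large dataset it would itself need to be approximated or distributed.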