Comparative Analysis of Hybrid Clustering Algorithm on Different Dataset

H. Malik, N. Laghari, D. Sangrasi, Z. Dayo
Published in: 2018 8th International Conference on Electronics Information and Emergency Communication (ICEIEC), 2018-06-01. DOI: 10.1109/ICEIEC.2018.8473568 (https://doi.org/10.1109/ICEIEC.2018.8473568)
Citations: 9

Abstract

Clustering is a data mining technique in which data are grouped based on similarity and dissimilarity. It is commonly used to identify hidden patterns in complex multidimensional data, and these patterns provide a basis for decision-making. The objective of this research is to find the best clustering algorithm. K-Means is a well-known clustering algorithm that is simple and easy to implement, but its drawback is that it performs poorly on high-dimensional data. To address this drawback, K-Means is combined with other techniques such as PSO (Particle Swarm Optimization) and PCA (Principal Component Analysis) for better results and cluster identification. In this paper, the authors apply K-Means and the hybrid approaches PSO-K-Means and PCA-K-Means to improve the clustering result, evaluated on three parameters (purity, Rand index, and computation time) on different data sets taken from the UCI Repository.
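
Two of the evaluation parameters named in the abstract, purity and Rand index, can be illustrated with a minimal sketch. The code below is not the authors' implementation; it uses the standard textbook definitions of both measures, and the function names and toy labels are hypothetical, chosen only for illustration.

```python
from collections import Counter
from itertools import combinations


def purity(true_labels, cluster_ids):
    """Fraction of points that fall in the majority class of their cluster."""
    clusters = {}
    for label, cid in zip(true_labels, cluster_ids):
        clusters.setdefault(cid, []).append(label)
    # For each cluster, count the members carrying its most common true label.
    majority_total = sum(
        Counter(members).most_common(1)[0][1] for members in clusters.values()
    )
    return majority_total / len(true_labels)


def rand_index(true_labels, cluster_ids):
    """Fraction of point pairs on which the two labelings agree.

    A pair agrees when both labelings put the two points together,
    or both labelings put them apart.
    """
    pairs = list(combinations(range(len(true_labels)), 2))
    agreements = sum(
        (true_labels[i] == true_labels[j]) == (cluster_ids[i] == cluster_ids[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Toy data (hypothetical, not taken from the paper or from UCI).
truth = ["a", "a", "a", "b", "b", "b"]
found = [0, 0, 1, 1, 1, 1]

print(purity(truth, found))      # 5 of 6 points match their cluster's majority class
print(rand_index(truth, found))  # 10 of 15 pairs are treated the same way by both labelings
```

Both measures lie between 0 and 1, with higher values indicating closer agreement with the reference labels. Computation time, the third parameter, would be measured separately around the clustering call itself.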