{"title":"Comparative Analysis of Hybrid Clustering Algorithm on Different Dataset","authors":"H. Malik, N. Laghari, D. Sangrasi, Z. Dayo","doi":"10.1109/ICEIEC.2018.8473568","DOIUrl":null,"url":null,"abstract":"Clustering is a data mining technique, in which data is grouped based on similarity and dissimilarity. Clustering is usually used to identify hidden pattern in multidimensional complex data and, these hidden pattern provide bases for making decisions. The objective of this research to find the best clustering algorithm. K-Mean is a famous clustering algorithm, which is simple and easy to implement, but the drawback of K-Mean is that, it does not work with higher dimensional data, for aiding with this drawback K-Mean is fused with other clustering algorithms such as PSO (Particle Swarm optimization) and PCA (Principle Component Analysis) for better results and cluster identifications. In this paper authors used hybrid clustering approach (K-Mean, PSO-K-Mean and PCA-K-Mean) to improve the clustering result based on parameter (Purity, Rand index and Computation Time) on different data sets taken from UCI Repository.","PeriodicalId":344233,"journal":{"name":"2018 8th International Conference on Electronics Information and Emergency Communication (ICEIEC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 8th International Conference on Electronics Information and Emergency Communication (ICEIEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICEIEC.2018.8473568","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 9
Abstract
Clustering is a data mining technique in which data is grouped based on similarity and dissimilarity. It is usually used to identify hidden patterns in complex multidimensional data, and these hidden patterns provide a basis for making decisions. The objective of this research is to find the best clustering algorithm. K-Means is a well-known clustering algorithm that is simple and easy to implement, but its drawback is that it does not work well with high-dimensional data. To address this drawback, K-Means is combined with other techniques, such as PSO (Particle Swarm Optimization) and PCA (Principal Component Analysis), for better results and cluster identification. In this paper, the authors compare plain K-Means with two hybrid clustering approaches (PSO-K-Means and PCA-K-Means), evaluating the clustering results in terms of three measures (purity, Rand index, and computation time) on different data sets taken from the UCI Repository.
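The abstract names purity and the Rand index as evaluation measures but does not give their formulas. The following is a minimal sketch of both, assuming the standard textbook definitions: purity is the fraction of points that belong to the majority class of their assigned cluster, and the Rand index is the fraction of point pairs on which the clustering and the reference labels agree. The function names and the toy labels are illustrative assumptions, not taken from the paper.

```python
from collections import Counter
from itertools import combinations


def purity(true_labels, cluster_ids):
    """Fraction of points that fall in the majority class of their cluster."""
    if len(true_labels) != len(cluster_ids) or not true_labels:
        raise ValueError("label sequences must be non-empty and of equal length")
    # Group the reference labels by the cluster each point was assigned to.
    members = {}
    for truth, cluster in zip(true_labels, cluster_ids):
        members.setdefault(cluster, []).append(truth)
    # For each cluster, count the points carrying its most common label.
    majority_total = sum(
        Counter(labels).most_common(1)[0][1] for labels in members.values()
    )
    return majority_total / len(true_labels)


def rand_index(true_labels, cluster_ids):
    """Fraction of point pairs on which the two labelings agree."""
    if len(true_labels) != len(cluster_ids) or len(true_labels) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    agreements = 0
    total_pairs = 0
    for i, j in combinations(range(len(true_labels)), 2):
        same_class = true_labels[i] == true_labels[j]
        same_cluster = cluster_ids[i] == cluster_ids[j]
        # A pair agrees if it is together in both labelings or apart in both.
        agreements += same_class == same_cluster
        total_pairs += 1
    return agreements / total_pairs


if __name__ == "__main__":
    # Hypothetical toy data: six points, two classes, two clusters.
    truth = ["a", "a", "a", "b", "b", "b"]
    found = [0, 0, 1, 1, 1, 1]
    print("purity:", purity(truth, found))
    print("Rand index:", rand_index(truth, found))
```

On the toy data, one point of class "a" lands in the wrong cluster, so purity is 5/6 and the Rand index is 10/15. The pairwise loop is quadratic in the number of points, which is acceptable for small UCI-sized data sets but would need a contingency-table formulation for larger ones.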