Novel technique for prediction analysis using normalization for an improvement in K-means clustering
Authors: Shruti Gupta, Abha Thakral, Shilpi Sharma
DOI: 10.1109/INCITE.2016.7857584 (https://doi.org/10.1109/INCITE.2016.7857584)
Journal: 下一代, vol. 30 (1), pp. 32-36
Published: 2016-10-01
Citations: 6
Abstract
Clustering is the unsupervised classification of patterns in a dataset. It is widely used to discover distributed patterns and group them into clusters. Clustering algorithms use a similarity measure based on distance. To cluster data points, k-means relies on the Euclidean distance measure and on the choice of central points. In k-means clustering, the data points are collected and a central point is chosen. The Euclidean distance from the chosen central point to each data point is then computed, and data points are assigned to clusters on that basis. One drawback of k-means is that the number of clusters must be provided in advance, as a result of which some data points remain unclustered. In this paper, we propose a clustering technique through which the number of clusters can be determined automatically. The proposed technique will improve accuracy and decrease clustering time; moreover, cluster quality will also be improved through multiple iterations.
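The k-means steps described in the abstract (choose central points, compute Euclidean distances, assign points to the nearest centre, repeat) can be illustrated with a minimal sketch. This is not the authors' proposed technique: it is plain Lloyd-style k-means preceded by min-max normalization, which is assumed here only because the title mentions normalization. The function names `min_max_normalize`, `euclidean`, and `k_means`, the sample data, and the fixed `k` are all hypothetical; the abstract does not say how the number of clusters is derived, so the caller still supplies it.

```python
import math
import random


def min_max_normalize(points):
    """Rescale each feature to [0, 1] so no single feature dominates the distance."""
    dims = len(points[0])
    lows = [min(p[d] for p in points) for d in range(dims)]
    highs = [max(p[d] for p in points) for d in range(dims)]
    return [
        tuple(
            (p[d] - lows[d]) / (highs[d] - lows[d]) if highs[d] > lows[d] else 0.0
            for d in range(dims)
        )
        for p in points
    ]


def euclidean(a, b):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def k_means(points, k, max_iter=100, seed=0):
    """Standard k-means: assign to the nearest centroid, then recompute centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: random initial central points
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda c: euclidean(p, centroids[c])) for p in points
        ]
        if new_labels == labels:  # assignments stopped changing: converged
            break
        labels = new_labels
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster became empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels, centroids


# Hypothetical data: the second feature has a much larger scale than the first.
data = [(1.0, 200.0), (1.2, 220.0), (0.9, 210.0),
        (8.0, 900.0), (8.3, 950.0), (7.9, 880.0)]
labels, centroids = k_means(min_max_normalize(data), k=2)
```

Normalizing first keeps the large-scale second feature from dominating the Euclidean distance. Because every point is always assigned to its nearest centroid, this baseline leaves no point unclustered, but it still depends on a well-chosen `k`.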