Cluster validation for subspace clustering on high dimensional data
Authors: Lifei Chen, Q. Jiang, Shengrui Wang
Published in: APCCAS 2008 - 2008 IEEE Asia Pacific Conference on Circuits and Systems (2008-11-01)
DOI: 10.1109/APCCAS.2008.4746001 (https://doi.org/10.1109/APCCAS.2008.4746001)
Citations: 5
Abstract
Cluster validation, an important issue in cluster analysis, is the process of evaluating the performance of clustering algorithms under varying input conditions. Many existing methods address clustering results on low-dimensional data. This paper presents a new solution to the problem of cluster validation for subspace clustering on high-dimensional data. We first propose two new measurements for the intra-cluster compactness and inter-cluster separation of subspace clusters. Based on these measurements and conventional indices, we present three new cluster validity indices that can be applied to subspace clustering. Combined with a soft subspace clustering algorithm, the new indices are used to determine the number of clusters in high-dimensional data. Experimental results on synthetic and real-world datasets show their effectiveness.
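
The abstract does not state the formulas for the two measurements or the three indices, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes a generic ratio-style index (weighted compactness divided by weighted separation, lower is better), and it assumes per-cluster feature weights derived from inverse variances as a stand-in for the weights a soft subspace clustering algorithm would produce. All function names, the toy data, and the candidate labelings are hypothetical.

```python
from itertools import combinations


def weighted_sq_dist(x, y, w):
    """Squared Euclidean distance with per-dimension weights w."""
    return sum(wj * (xj - yj) ** 2 for xj, yj, wj in zip(x, y, w))


def summarize(points, labels, eps=1e-6):
    """Return per-cluster centers and feature weights.

    Assumption: weights are normalized inverse variances, so dimensions
    in which a cluster is tight receive a large weight.
    """
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    centers, weights = {}, {}
    for lab, members in clusters.items():
        dim = len(members[0])
        n = len(members)
        center = [sum(p[j] for p in members) / n for j in range(dim)]
        var = [sum((p[j] - center[j]) ** 2 for p in members) / n
               for j in range(dim)]
        inv = [1.0 / (v + eps) for v in var]
        total = sum(inv)
        centers[lab] = center
        weights[lab] = [v / total for v in inv]
    return centers, weights


def subspace_index(points, labels):
    """Ratio of subspace compactness to subspace separation (lower is better)."""
    centers, weights = summarize(points, labels)
    if len(centers) < 2:
        raise ValueError("need at least two clusters")
    # Compactness: mean weighted distance of each point to its own center,
    # measured with the weights of its own cluster.
    compactness = sum(
        weighted_sq_dist(p, centers[lab], weights[lab])
        for p, lab in zip(points, labels)
    ) / len(points)
    # Separation: smallest weighted distance between two centers, measured
    # with the average of the two clusters' weights (an assumption).
    separation = min(
        weighted_sq_dist(
            centers[a], centers[b],
            [(wa + wb) / 2 for wa, wb in zip(weights[a], weights[b])],
        )
        for a, b in combinations(centers, 2)
    )
    return compactness / separation


# Toy data: cluster A is tight in dimensions 0 and 1, cluster B in 1 and 2.
points = [
    (0.0, 0.1, 0.0), (0.1, 0.0, 5.0), (-0.1, -0.1, 10.0), (0.0, 0.0, 15.0),
    (0.0, 5.0, 5.1), (5.0, 5.1, 5.0), (10.0, 4.9, 4.9), (15.0, 5.0, 5.0),
]

# Hypothetical labelings for two candidate numbers of clusters.
candidates = {
    2: [0, 0, 0, 0, 1, 1, 1, 1],
    3: [0, 0, 0, 0, 1, 1, 2, 2],
}

scores = {k: subspace_index(points, labs) for k, labs in candidates.items()}
best_k = min(scores, key=scores.get)
print(scores, best_k)
```

In practice the candidate labelings would come from running a subspace clustering algorithm once per candidate number of clusters, and the number with the best index value would be selected. The sketch does not handle coincident centers, where the separation is zero.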