The method of clusters stability assessing
Authors: Aleksejs Lozkins, V. Bure
Venue: 2015 International Conference "Stability and Control Processes" in Memory of V.I. Zubov (SCP)
Publication date: 2015-12-03
DOI: 10.1109/SCP.2015.7342177 (https://doi.org/10.1109/SCP.2015.7342177)
Citation count: 1
Abstract
This article proposes a method for estimating the “correct” number of clusters. It follows the stability methodology and introduces a quantitative assessment of the stability level. Choosing the number of clusters is an important and common problem in cluster analysis, and no single criterion works for all types of datasets. The suggested approach also addresses a second problem. In the analysis of socio-economic information, the initial numerical data often contain various kinds of inaccuracies (measurement errors, intentional misrepresentations, calculation errors, errors of the mathematical model, and other possible sources of error). It is therefore important to choose classifications of the objects under study that are stable with respect to random noise. The work aims to obtain a reliable and accurate clustering that is stable with respect to random perturbations, and to solve an “ill-posed” problem in cluster analysis, namely finding the suggested number of clusters. The paper offers a probabilistic approach to this problem. The variability frequency based on random perturbation is introduced and examined as the main metric for assessing clustering results. This estimate can be used with different clustering algorithms, and their stability indices can be compared without additional procedures. An experiment on artificial data using k-means clustering is carried out.
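The abstract does not give a formula for the variability frequency, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' procedure. It assumes Gaussian perturbations, a plain Lloyd's k-means with a deterministic farthest-point initialisation, and a hypothetical definition of the metric: the mean fraction of point pairs whose co-membership changes between the clustering of the original data and the clustering of a perturbed copy. All function names, the noise level `sigma`, and the artificial three-blob dataset are illustrative choices.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centres = [points[0]]
    while len(centres) < k:
        # Next centre: the point farthest from all centres chosen so far.
        centres.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centres)))
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: sq_dist(p, centres[j])) for p in points]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centre if a cluster becomes empty
                centres[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels


def pair_disagreement(labels_a, labels_b):
    """Fraction of point pairs grouped together in one labelling but not the other."""
    n = len(labels_a)
    changed = 0
    for i in range(n):
        for j in range(i + 1, n):
            same_a = labels_a[i] == labels_a[j]
            same_b = labels_b[i] == labels_b[j]
            changed += same_a != same_b
    return changed / (n * (n - 1) / 2)


def variability_frequency(points, k, sigma, trials, rng):
    """Assumed metric: mean pair disagreement between baseline and noisy re-clusterings."""
    base = kmeans(points, k)
    total = 0.0
    for _ in range(trials):
        # Perturb every coordinate with independent Gaussian noise (an assumption).
        noisy = [tuple(x + rng.gauss(0.0, sigma) for x in p) for p in points]
        total += pair_disagreement(base, kmeans(noisy, k))
    return total / trials


# Artificial data: three well-separated Gaussian blobs of 30 points each.
rng = random.Random(0)
blob_centres = [(0.0, 0.0), (12.0, 0.0), (6.0, 10.0)]
data = [(cx + rng.gauss(0, 0.4), cy + rng.gauss(0, 0.4))
        for cx, cy in blob_centres for _ in range(30)]

scores = {k: variability_frequency(data, k, sigma=0.3, trials=20, rng=rng)
          for k in range(2, 6)}
for k, score in scores.items():
    print(f"k={k}: variability frequency = {score:.3f}")
```

Comparing pairs of points rather than raw labels avoids having to match cluster labels between runs, and the same score can be computed for any clustering routine that returns one label per point. Under this assumed definition, a lower score means a more stable clustering, so candidate values of k can be ranked by it.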