{"title":"Validation approaches for FCM algorithm","authors":"E. Říhová, David Ríha","doi":"10.18267/pr.2019.los.186.128","DOIUrl":null,"url":null,"abstract":"Clustering techniques can be used to organize into groups based on similarities among the individual data. In other words, clustering techniques are tools for discovering the previously hidden structure in a set, where the objects from one cluster are as similar as possible and objects from different clusters are dissimilar as possible. There are many different coefficients for estimating the optimal number of clusters. Each of these coefficients has its strengths and weaknesses. In this research, several coefficients for estimating the optimal number of clusters (for fuzzy clustering techniques) are examined. Also, their strengths and weaknesses are studied. And finally, the new coefficient for evaluating the fuzzy C-means clustering results is presented. The proposed coefficient is compared with a number of popular validation indices on nine datasets. The experimental results show that the effectiveness and reliability of the proposal is superior to other indices. The main advantage of this new coefficient is that, it works correct on data sets with large and small number of clusters. This characteristic of the new coefficient is very significant, as this algorithm require the number of clusters as an input, and the analysis result can vary greatly depending on the value chosen for this variable.","PeriodicalId":235267,"journal":{"name":"International Days of Statistics and Economics 2019","volume":"18 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Validation approaches for FCM algorithm\",\"authors\":\"E. 
Říhová, David Ríha\",\"doi\":\"10.18267/pr.2019.los.186.128\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Clustering techniques can be used to organize into groups based on similarities among the individual data. In other words, clustering techniques are tools for discovering the previously hidden structure in a set, where the objects from one cluster are as similar as possible and objects from different clusters are dissimilar as possible. There are many different coefficients for estimating the optimal number of clusters. Each of these coefficients has its strengths and weaknesses. In this research, several coefficients for estimating the optimal number of clusters (for fuzzy clustering techniques) are examined. Also, their strengths and weaknesses are studied. And finally, the new coefficient for evaluating the fuzzy C-means clustering results is presented. The proposed coefficient is compared with a number of popular validation indices on nine datasets. The experimental results show that the effectiveness and reliability of the proposal is superior to other indices. The main advantage of this new coefficient is that, it works correct on data sets with large and small number of clusters. 
This characteristic of the new coefficient is very significant, as this algorithm require the number of clusters as an input, and the analysis result can vary greatly depending on the value chosen for this variable.\",\"PeriodicalId\":235267,\"journal\":{\"name\":\"International Days of Statistics and Economics 2019\",\"volume\":\"18 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1900-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"International Days of Statistics and Economics 2019\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.18267/pr.2019.los.186.128\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Days of Statistics and Economics 2019","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18267/pr.2019.los.186.128","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Clustering techniques can be used to organize data into groups based on similarities among the individual objects. In other words, clustering techniques are tools for discovering previously hidden structure in a data set, such that objects within one cluster are as similar as possible and objects from different clusters are as dissimilar as possible. Many different coefficients exist for estimating the optimal number of clusters, and each has its strengths and weaknesses. This research examines several such coefficients for fuzzy clustering techniques and studies their strengths and weaknesses. Finally, a new coefficient for evaluating fuzzy C-means (FCM) clustering results is presented. The proposed coefficient is compared with a number of popular validation indices on nine datasets. The experimental results show that the proposed coefficient is more effective and more reliable than the other indices. Its main advantage is that it works correctly on data sets with both large and small numbers of clusters. This characteristic is significant because the FCM algorithm requires the number of clusters as an input, and the analysis result can vary greatly depending on the value chosen.
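
To make the role of a validation index concrete, the following is a minimal sketch of how the number of clusters might be scanned for fuzzy C-means. The abstract does not name the indices it compares, so the sketch uses Bezdek's partition coefficient, one commonly cited FCM validity index, purely as an illustrative assumption; it is not the coefficient proposed in the paper. The function names, the fuzzifier `m = 2`, and the synthetic three-blob data are likewise hypothetical choices made for the example.

```python
import numpy as np


def fuzzy_c_means(X, c, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Basic fuzzy C-means. Returns (centers, U), where U has shape (n, c)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)  # memberships of each object sum to 1
    for _ in range(n_iter):
        W = U ** m
        # Cluster centers are membership-weighted means of the objects.
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Squared Euclidean distance from every object to every center.
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        d2 = np.maximum(d2, 1e-12)  # guard against division by zero
        inv = d2 ** (-1.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def partition_coefficient(U):
    """Bezdek's partition coefficient: mean of squared memberships.

    Ranges from 1/c (completely fuzzy) to 1 (crisp partition).
    """
    return float((U ** 2).sum() / U.shape[0])


# Illustrative synthetic data (an assumption, not from the paper):
# three Gaussian blobs in the plane.
rng = np.random.default_rng(42)
blob_centers = np.array([[0.0, 0.0], [6.0, 0.0], [3.0, 5.0]])
X = np.vstack([mu + rng.normal(scale=0.5, size=(50, 2)) for mu in blob_centers])

# FCM needs the number of clusters c as input, so a validity index is
# evaluated for each candidate c and the values are compared afterwards.
scores = {}
for c in range(2, 7):
    _, U = fuzzy_c_means(X, c)
    scores[c] = partition_coefficient(U)
    print(f"c={c}: partition coefficient = {scores[c]:.3f}")
```

A candidate number of clusters would typically be chosen by comparing the index values across `c`. The partition coefficient is only one option, and different indices can disagree on the same data, which is the motivation the abstract gives for examining several of them.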