Fuzzy Cluster Analysis of Larger Data Sets

Scalable Fuzzy Algorithms for Data Management and Analysis. Pub Date: 2009-10-01. DOI: 10.4018/978-1-60566-858-1.CH012

R. Winkler, F. Klawonn, F. Höppner, R. Kruse


Citations: 3

Abstract

Applying fuzzy cluster analysis to larger data sets can cause runtime and memory overflow problems. While deterministic (hard) clustering assigns each data object to a unique cluster, fuzzy clustering distributes the membership of a data object over several clusters. In standard fuzzy clustering, membership degrees (almost) never become zero, so every data object is assigned to all clusters, even if only with very small membership degrees. As a consequence, standard fuzzy clustering not only demands more computation and memory, it also has the undesired effect that every data object always influences every cluster, no matter how far away from that cluster it lies. New approaches that modify the idea of the fuzzifier have been developed to avoid the problem of nonzero membership degrees for all data and clusters. In this paper, these ideas are combined with concepts for speeding up fuzzy clustering through a suitable data organization, so that fuzzy clustering can be applied more efficiently to larger data sets.
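The claim that membership degrees (almost) never become zero can be illustrated with a minimal sketch. It assumes the standard fuzzy c-means membership formula with Euclidean distance and fuzzifier m > 1. The function name `fcm_memberships` and the sample centers are illustrative assumptions, not taken from the chapter.

```python
def fcm_memberships(point, centers, m=2.0):
    """Membership degrees of one point under standard fuzzy c-means.

    Illustrative helper (not from the chapter); assumes fuzzifier m > 1
    and squared Euclidean distances.
    """
    # Squared distances from the point to every cluster center.
    d2 = [sum((p - c) ** 2 for p, c in zip(point, center)) for center in centers]

    # Special case: a point lying exactly on one or more centers
    # shares its full membership among those centers.
    if any(d == 0.0 for d in d2):
        hits = [1.0 if d == 0.0 else 0.0 for d in d2]
        return [h / sum(hits) for h in hits]

    # u_i = 1 / sum_k (d_i^2 / d_k^2)^(1 / (m - 1))
    exponent = 1.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** exponent for dk in d2) for di in d2]


# Hypothetical example: one point close to the first of three centers.
centers = [(0.0, 0.0), (10.0, 0.0), (100.0, 0.0)]
u = fcm_memberships((1.0, 0.0), centers)

# Even the most distant cluster keeps a small positive membership,
# so the point still has to be stored and processed for every cluster.
for center, degree in zip(centers, u):
    print(center, degree)
```

Because every degree stays positive, a data set with n objects and c clusters needs all n·c memberships in each iteration, which is the computational and memory cost the abstract refers to.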
