Scalable Fuzzy Clustering With Collaborative Structure Learning and Preservation

Impact factor: 11.9 · Zone 1 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence)

IEEE Transactions on Fuzzy Systems · Pub Date: 2025-09-03 · DOI: 10.1109/TFUZZ.2025.3581679

Bingbing Jiang;Chenglong Zhang;Zhongli Wang;Xinyan Liang;Peng Zhou;Liang Du;Qinghua Zhang;Weiping Ding;Yi Liu

Volume 33, Issue 9, pp. 3047-3060 · Journal Article · Full text: https://ieeexplore.ieee.org/document/11150476/

Citations: 0

Abstract


Scalable Fuzzy Clustering With Collaborative Structure Learning and Preservation

To partition samples into distinct clusters, Fuzzy C-Means (FCM) calculates the membership degrees of samples to cluster centers and provides soft labels, gaining significant attention in recent years. However, existing FCM methods encounter the following challenges. First, traditional FCM focuses on learning membership degrees, neglecting the data similarity structures. Second, graph-based FCM typically separates graph construction from clustering, overlooking the knowledge interaction between graphs and clustering, obtaining suboptimal performance. Third, exploring the similarity structures among all samples is computationally expensive for large-scale tasks. To solve these dilemmas, we propose a scalable fuzzy clustering with collaborative structure learning and preservation (CSLP), which simultaneously leverages both cluster information and similarity structures to learn an optimal membership degree representation. Specifically, a self-weighted manner is devised to measure the sample importance, thereby reducing the adverse impacts of outliers. Moreover, the graph is updated according to the data similarities in the membership degree representation, such that CSLP collaboratively learns the graph and membership degrees in a mutually reinforcing manner. Thus, the similarity structures are fully explored during clustering processes and preserved in the learned membership degrees, enhancing the discrimination of clustering labels. To further improve efficiency, an acceleration solution is developed to reduce the computational cost of CSLP by propagating membership degrees from potential centers to samples, making CSLP scalable for large-scale tasks. An iterative strategy is designed to solve the formulated objective function. Extensive experiments demonstrate that CSLP outperforms other fuzzy clustering methods in terms of both effectiveness and scalability.
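For readers unfamiliar with the baseline the abstract starts from, the following is a minimal sketch of textbook Fuzzy C-Means, which alternates between a weighted-mean center update and a distance-based membership update. It is not CSLP: it includes no self-weighting, graph learning, or acceleration step, since the abstract does not give those formulations. The function name `fuzzy_c_means`, its parameters, and the fuzzifier default `m=2.0` are assumptions chosen for illustration, and only the Python standard library is used so the block stays self-contained.

```python
import math
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Textbook Fuzzy C-Means on a list of equal-length numeric tuples.

    Returns (centers, memberships). memberships[i][k] is the degree to which
    sample i belongs to cluster k, and each row sums to 1 (soft labels).
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, normalized so each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-12 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Center update: mean of the samples weighted by u_ik ** m.
        new_centers = []
        for k in range(n_clusters):
            weights = [u[i][k] ** m for i in range(n)]
            total = sum(weights)
            if total == 0.0:
                # Degenerate cluster with no mass: keep its previous center.
                new_centers.append(centers[k])
                continue
            new_centers.append([
                sum(w * p[d] for w, p in zip(weights, points)) / total
                for d in range(dim)
            ])
        centers = new_centers

        # Membership update: u_ik is proportional to d_ik ** (-2 / (m - 1)).
        max_change = 0.0
        for i, p in enumerate(points):
            dists = [math.dist(p, c) for c in centers]
            if min(dists) == 0.0:
                # Sample coincides with a center: full membership there.
                hit = dists.index(0.0)
                new_row = [1.0 if k == hit else 0.0 for k in range(n_clusters)]
            else:
                inv = [d ** (-2.0 / (m - 1.0)) for d in dists]
                total = sum(inv)
                new_row = [v / total for v in inv]
            change = max(abs(a - b) for a, b in zip(new_row, u[i]))
            max_change = max(max_change, change)
            u[i] = new_row

        # Stop once no membership degree moved by more than tol.
        if max_change < tol:
            break

    return centers, u


# Usage on two well-separated toy groups (illustrative data, not from the paper).
data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.0), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
centers, u = fuzzy_c_means(data, n_clusters=2)
labels = [row.index(max(row)) for row in u]  # harden soft labels by argmax
print(labels)
```

Each full pass costs on the order of the number of samples times the number of clusters times the dimension. The similarity graph over all samples that the abstract describes as expensive is a separate structure that this baseline never builds.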


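Source journal

IEEE Transactions on Fuzzy Systems (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 20.50

Self-citation rate: 13.40%

Articles published: 517

Review time: 3.0 months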

About the journal: The IEEE Transactions on Fuzzy Systems is a scholarly journal that focuses on the theory, design, and application of fuzzy systems. It aims to publish high-quality technical papers that contribute significant technical knowledge and exploratory developments in the field of fuzzy systems. The journal particularly emphasizes engineering systems and scientific applications. In addition to research articles, the Transactions also includes a letters section featuring current information, comments, and rebuttals related to published papers.