Self-Supervised Graph Embedding Clustering
Fangfang Li, Quanxue Gao, Xiaoke Ma, Ming Yang, Cheng Deng
IEEE Transactions on Pattern Analysis and Machine Intelligence, published 2025-08-14. DOI: 10.1109/TPAMI.2025.3599185
Abstract
Manifold learning and $K$-means are two powerful techniques for data analysis in the field of artificial intelligence. When used for label learning, a promising strategy is to combine them directly and optimize both models simultaneously. However, a significant drawback of this approach is that it represents a naive and crude integration, requiring the optimization of all variables in both models without achieving a truly essential combination. Additionally, it introduces an extra hyperparameter and cannot ensure cluster balance. These challenges motivate us to explore whether a meaningful integration can be developed for dimensionality reduction clustering. In this paper, we propose a novel self-supervised manifold clustering framework that reformulates the two models into a unified framework, eliminating the need for additional hyperparameters while achieving dimensionality reduction clustering. Specifically, by analyzing the relationship between $K$-means and manifold learning, we construct a meaningful low-dimensional manifold clustering model that directly produces the label matrix of the data. The label information is then used to guide the learning of the manifold structure, ensuring consistency between the manifold structure and the labels. Notably, we identify a valuable role of $\ell_{2,p}$-norm regularization in clustering: maximizing the $\ell_{2,p}$-norm naturally maintains class balance during clustering, and we provide a theoretical proof of this property. Extensive experimental results demonstrate the efficiency of our proposed model.
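The class-balance property claimed for $\ell_{2,p}$-norm maximization can be illustrated with a small numeric sketch (this is not the authors' code; it assumes the common column-wise definition $\|Y\|_{2,p} = (\sum_j \|y_j\|_2^p)^{1/p}$ applied to a one-hot cluster indicator matrix $Y$). For a cluster with $n_j$ members, the column norm is $\sqrt{n_j}$, and since $x^{p/2}$ is concave for $0 < p < 2$, the sum $\sum_j n_j^{p/2}$ is maximized when cluster sizes are equal:

```python
import numpy as np

def l2p_norm(Y, p=1.0):
    # l_{2,p}-norm over columns: (sum_j ||y_j||_2^p)^(1/p)
    col_norms = np.linalg.norm(Y, axis=0)
    return np.sum(col_norms ** p) ** (1.0 / p)

# One-hot indicator matrices for 6 samples, 2 clusters.
# Balanced split (3/3) vs. imbalanced split (5/1).
balanced = np.array([[1, 0], [1, 0], [1, 0],
                     [0, 1], [0, 1], [0, 1]], dtype=float)
imbalanced = np.array([[1, 0], [1, 0], [1, 0],
                       [1, 0], [1, 0], [0, 1]], dtype=float)

# With p=1: balanced gives 2*sqrt(3) ~ 3.464, imbalanced sqrt(5)+1 ~ 3.236,
# so the balanced assignment attains the larger l_{2,1}-norm.
print(l2p_norm(balanced), l2p_norm(imbalanced))
```

The same ordering holds for any $p$ in $(0, 2)$; at $p = 2$ the norm becomes $\sqrt{n}$ regardless of the assignment, which is why the regime $p < 2$ is the interesting one for balance.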