Weijia Lu, Min Wang, Yun Yu, Liang Ma, Yaxiang Shi, Zhongqiu Huang, Ming Gong
Title: A novel self-supervised graph clustering method with reliable semi-supervision
DOI: 10.1016/j.neunet.2025.107418
Journal: Neural Networks, volume 187, Article 107418
Publication date: 2025-03-16
Publication type: Journal Article
Impact factor: 6.0 (JCR Q1, Computer Science, Artificial Intelligence)
URL: https://www.sciencedirect.com/science/article/pii/S0893608025002977
Citation count: 0
Abstract
Cluster analysis is a core technique in unsupervised learning with widespread applications. As data grow more complex, deep clustering, which combines deep learning with traditional clustering algorithms, performs well on high-dimensional and complex data. When applied to graph data, however, deep clustering faces two major challenges: noise and sparsity. Noise introduces misleading connections, while sparsity makes it difficult to capture relationships between nodes accurately. Together, these issues make feature extraction harder and significantly degrade clustering performance. To address them, we propose a novel Self-Supervised Graph Clustering model based on Reliable Semi-Supervision (SSGC-RSS), which consists of an upstream and a downstream component. The upstream component employs a dual-decoder graph autoencoder with joint clustering optimization; it preserves the latent information of both features and graph structure, and alleviates the sparsity problem by generating cluster centers and pseudo-labels. The downstream component uses a semi-supervised graph attention encoding network, built on highly reliable samples and their pseudo-labels, to select reliable samples for training, thereby reducing the interference of noise. Experiments on multiple graph datasets show that SSGC-RSS outperforms existing methods, with accuracy improvements of 0.9%, 2.0%, and 5.6% on the Cora, Citeseer, and Pubmed datasets, respectively, supporting its effectiveness on complex graph clustering tasks.
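The abstract does not say how cluster centers become pseudo-labels or how "highly reliable" samples are chosen. As a rough illustration only, the sketch below assumes a DEC-style Student's t soft assignment between node embeddings and cluster centers, and treats a node as reliable when its largest assignment probability reaches a threshold. The function names, the kernel, and the threshold value are all assumptions made for this example, not details taken from SSGC-RSS.

```python
from math import fsum


def soft_assign(z, centers, alpha=1.0):
    """Student's t soft assignment of each embedding to each cluster center.

    Assumption: this DEC-style kernel stands in for whatever assignment
    function the paper actually uses; the abstract does not specify it.
    """
    q = []
    for point in z:
        weights = []
        for center in centers:
            dist2 = fsum((p - c) ** 2 for p, c in zip(point, center))
            weights.append((1.0 + dist2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = fsum(weights)
        # Normalize so each row is a probability distribution over clusters.
        q.append([w / total for w in weights])
    return q


def select_reliable(q, threshold=0.8):
    """Return {node_index: pseudo_label} for confidently assigned nodes.

    Assumption: a fixed confidence threshold is a hypothetical reliability
    criterion chosen for illustration.
    """
    reliable = {}
    for i, row in enumerate(q):
        label = max(range(len(row)), key=row.__getitem__)
        if row[label] >= threshold:
            reliable[i] = label
    return reliable


# Toy 2-D embeddings: two tight groups and one ambiguous node in between.
embeddings = [[0.0, 0.0], [0.1, 0.0], [4.0, 4.0], [4.1, 4.0], [2.0, 2.0]]
centers = [[0.0, 0.0], [4.0, 4.0]]

q = soft_assign(embeddings, centers)
pseudo_labels = select_reliable(q, threshold=0.8)
print(pseudo_labels)
```

In this toy example the node at (2.0, 2.0) is equidistant from both centers, so it receives probability 0.5 for each cluster and is left out of the reliable set. Only the four confidently assigned nodes would be passed on, with their pseudo-labels, to a semi-supervised stage.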
About the journal:
Neural Networks is a platform that aims to foster an international community of scholars and practitioners interested in neural networks, deep learning, and other approaches to artificial intelligence and machine learning. Our journal invites submissions covering various aspects of neural networks research, from computational neuroscience and cognitive modeling to mathematical analyses and engineering applications. By providing a forum for interdisciplinary discussions between biology and technology, we aim to encourage the development of biologically-inspired artificial intelligence.