Xunlian Wu, Jingqi Hu, Yining Quan, Qiguang Miao, Peng Gang Sun
Motif-based Contrastive Graph Clustering with clustering-oriented prompt
Information Processing & Management, 62(5), Article 104208. Published 2025-05-26. Impact factor 7.4; JCR Q1 (Computer Science, Information Systems).
DOI: 10.1016/j.ipm.2025.104208
https://www.sciencedirect.com/science/article/pii/S0306457325001499
Citations: 0
Abstract
Graph contrastive learning has shown significant promise in graph clustering, yet prevalent approaches face two limitations: (1) most existing methods primarily capture lower-order adjacency structures, overlooking the higher-order motifs that are essential building blocks of the network; (2) most of them do not address false-negative pairs and lack cluster-oriented guidance, potentially embedding irrelevant information in the node representations. To overcome these issues, we introduce a novel Motif-based Contrastive Graph Clustering approach with Clustering-Oriented Prompt (MCGC). First, MCGC employs a specialized Siamese encoder network to obtain both lower-order and higher-order node embeddings. The encoder processes two views of the graph: one based on lower-order adjacency and the other on higher-order motif structures, where higher-order motifs (such as triangles) are extracted using motif adjacency matrices. Then, structural contrastive learning is used to ensure cross-view structural consistency. Furthermore, node-level contrastive learning is designed to enhance the discriminative capability of node embeddings, while interactions between samples and centroids provide clustering-oriented prompts. Finally, a parameter-shared MLP aligns embeddings in a unified clustering space, refined by cluster-level contrastive learning. These contrastive learning strategies ensure better-defined cluster boundaries and improve the quality of node representations. The approach is versatile and can be applied in recommendation systems, where clustering similar users enhances personalized recommendations, and in anomaly detection, where it helps identify unusual patterns or outliers in transaction or social networks. Experimental results on six datasets demonstrate that MCGC outperforms state-of-the-art algorithms. For example, on the EAT dataset, MCGC achieves 58.68% in ACC, surpassing the runner-up (CCGC) by 4.71%, demonstrating the effectiveness of motif-based contrastive learning in improving clustering quality. The source code is available at: https://github.com/CSLab208/MCGC-Motif-based.
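To make the higher-order view mentioned in the abstract concrete, here is a minimal sketch of a triangle motif adjacency matrix. It assumes an undirected, unweighted graph and uses the standard identity M = (A·A) ∘ A, where M[i, j] counts the triangles that contain edge (i, j). The function name `triangle_motif_adjacency` and the toy graph are illustrative assumptions; the abstract does not spell out MCGC's exact construction, so this should not be read as the authors' implementation.

```python
import numpy as np


def triangle_motif_adjacency(adj: np.ndarray) -> np.ndarray:
    """Return M where M[i, j] is the number of triangles containing edge (i, j).

    Assumption: the graph is treated as undirected and unweighted.
    """
    a = (np.asarray(adj) != 0).astype(np.int64)  # binarize edge weights
    a = np.maximum(a, a.T)                       # symmetrize
    np.fill_diagonal(a, 0)                       # drop self-loops
    # (a @ a)[i, j] counts common neighbours of i and j;
    # multiplying elementwise by a keeps only pairs that are themselves edges.
    return (a @ a) * a


# Toy graph (hypothetical): triangle 0-1-2 plus a pendant edge 2-3.
adj = np.array([
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
])
motif_adj = triangle_motif_adjacency(adj)
print(motif_adj)
```

In the toy graph, the pendant edge 2-3 belongs to no triangle, so it disappears from the motif view while the three triangle edges each receive weight 1. This illustrates how a motif-based view can differ from the lower-order adjacency view.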
About the journal:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing.
We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.