Node Clustering on Attributed Graph Using Anchor Sampling Strategy and Debiasing Strategy
Authors: Qian Tang; Yiji Zhao; Hao Wu; Lei Zhang
Journal: IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 8, no. 4, pp. 3017-3028
Publication date: 2024-03-12
Publication type: Journal Article
DOI: 10.1109/TETCI.2024.3369849
URL: https://ieeexplore.ieee.org/document/10463188/
Impact factor: 5.3; JCR: Q1 (Computer Science, Artificial Intelligence)
Citations: 0
Abstract
Contrastive representation learning has been widely employed in attributed graph clustering and has demonstrated significant success. However, existing methods have two problems: 1) Under the assumption that clusters form around a minority of central anchor nodes, previous works do not explore the contrastive relationships between these anchors. 2) They fail to handle biased sample pairs, which may degrade representation quality and cause poor clustering performance. To solve these problems, we propose a framework termed GE-S-D for both node representation learning and clustering, which consists of an anchor sampling strategy, a low-pass graph encoder, and a debiasing strategy. Specifically, to reveal the contrastive relationships between anchors, we design a sampling strategy that selects a small number of anchors and then constructs a training set of positive and negative sample pairs for contrastive learning. Then, we introduce a low-pass graph encoder to propagate contrastive messages to all nodes and learn cluster-friendly node representations. Furthermore, to alleviate the interference of biased sample pairs, we design a debiasing strategy that runs K-Means on the node representations to obtain clustering information and removes the false positive and false negative sample pairs from the training set, improving contrastive learning. The clustering performance is verified on five benchmark datasets, and our method is superior to many state-of-the-art methods according to quantitative and qualitative analysis.
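The debiasing step described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the helper names `kmeans` and `debias_pairs`, the toy 2-D representations, and the evenly spaced center initialization are all hypothetical. The sketch also assumes that a "false positive" is a positive pair whose two nodes fall in different clusters and a "false negative" is a negative pair whose two nodes fall in the same cluster.

```python
def kmeans(points, k, iters=20):
    """Plain Lloyd's K-Means; returns one cluster label per point.

    Assumption: centers start at evenly spaced input points so the sketch
    is deterministic. Practical K-Means usually uses random or k-means++
    initialization instead.
    """
    n = len(points)
    centers = [list(points[i * n // k]) for i in range(k)]
    labels = [0] * n
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def debias_pairs(pos_pairs, neg_pairs, labels):
    """Drop pairs that contradict the current clustering (assumed rule)."""
    # A positive pair split across clusters is treated as a false positive.
    kept_pos = [(i, j) for i, j in pos_pairs if labels[i] == labels[j]]
    # A negative pair inside one cluster is treated as a false negative.
    kept_neg = [(i, j) for i, j in neg_pairs if labels[i] != labels[j]]
    return kept_pos, kept_neg


# Toy node representations: nodes 0-2 and nodes 3-5 form two clear groups.
embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
              (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
labels = kmeans(embeddings, k=2)

pos_pairs = [(0, 1), (0, 3)]  # (0, 3) spans both groups
neg_pairs = [(0, 4), (1, 2)]  # (1, 2) lies inside one group
kept_pos, kept_neg = debias_pairs(pos_pairs, neg_pairs, labels)
print(kept_pos, kept_neg)
```

On this toy input the pair `(0, 3)` is dropped from the positives and `(1, 2)` from the negatives. In a training loop, the filtering would presumably be repeated as the representations change, but the abstract does not specify the schedule.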
About the journal:
The IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI) publishes original articles on emerging aspects of computational intelligence, including theory, applications, and surveys.
TETCI is an electronic-only publication and publishes six issues per year.
Authors are encouraged to submit manuscripts on any emerging topic in computational intelligence, especially nature-inspired computing topics not covered by other IEEE Computational Intelligence Society journals. A few illustrative examples are glial cell networks, computational neuroscience, brain-computer interfaces, ambient intelligence, non-fuzzy computing with words, artificial life, cultural learning, artificial endocrine networks, social reasoning, artificial hormone networks, and computational intelligence for the IoT and Smart-X technologies.