{"title":"通过对比硬采样进行无监督节点聚类","authors":"Hang Cui, Tarek Abdelzaher","doi":"arxiv-2409.07718","DOIUrl":null,"url":null,"abstract":"This paper introduces a fine-grained contrastive learning scheme for\nunsupervised node clustering. Previous clustering methods only focus on a small\nfeature set (class-dependent features), which demonstrates explicit clustering\ncharacteristics, ignoring the rest of the feature spaces (class-invariant\nfeatures). This paper exploits class-invariant features via graph contrastive\nlearning to discover additional high-quality features for unsupervised\nclustering. We formulate a novel node-level fine-grained augmentation framework\nfor self-supervised learning, which iteratively identifies competitive\ncontrastive samples from the whole feature spaces, in the form of positive and\nnegative examples of node relations. While positive examples of node relations\nare usually expressed as edges in graph homophily, negative examples are\nimplicit without a direct edge. We show, however, that simply sampling nodes\nbeyond the local neighborhood results in less competitive negative pairs, that\nare less effective for contrastive learning. Inspired by counterfactual\naugmentation, we instead sample competitive negative node relations by creating\nvirtual nodes that inherit (in a self-supervised fashion) class-invariant\nfeatures, while altering class-dependent features, creating contrasting pairs\nthat lie closer to the boundary and offering better contrast. Consequently, our\nexperiments demonstrate significant improvements in supervised node clustering\ntasks on six baselines and six real-world social network datasets.","PeriodicalId":501032,"journal":{"name":"arXiv - CS - Social and Information Networks","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2024-09-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Unsupervised node clustering via contrastive hard sampling\",\"authors\":\"Hang Cui, Tarek Abdelzaher\",\"doi\":\"arxiv-2409.07718\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper introduces a fine-grained contrastive learning scheme for\\nunsupervised node clustering. Previous clustering methods only focus on a small\\nfeature set (class-dependent features), which demonstrates explicit clustering\\ncharacteristics, ignoring the rest of the feature spaces (class-invariant\\nfeatures). This paper exploits class-invariant features via graph contrastive\\nlearning to discover additional high-quality features for unsupervised\\nclustering. We formulate a novel node-level fine-grained augmentation framework\\nfor self-supervised learning, which iteratively identifies competitive\\ncontrastive samples from the whole feature spaces, in the form of positive and\\nnegative examples of node relations. While positive examples of node relations\\nare usually expressed as edges in graph homophily, negative examples are\\nimplicit without a direct edge. We show, however, that simply sampling nodes\\nbeyond the local neighborhood results in less competitive negative pairs, that\\nare less effective for contrastive learning. Inspired by counterfactual\\naugmentation, we instead sample competitive negative node relations by creating\\nvirtual nodes that inherit (in a self-supervised fashion) class-invariant\\nfeatures, while altering class-dependent features, creating contrasting pairs\\nthat lie closer to the boundary and offering better contrast. 
Consequently, our\\nexperiments demonstrate significant improvements in supervised node clustering\\ntasks on six baselines and six real-world social network datasets.\",\"PeriodicalId\":501032,\"journal\":{\"name\":\"arXiv - CS - Social and Information Networks\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Social and Information Networks\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.07718\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Social and Information Networks","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.07718","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Unsupervised node clustering via contrastive hard sampling
This paper introduces a fine-grained contrastive learning scheme for
unsupervised node clustering. Previous clustering methods focus only on a small
feature set (class-dependent features) that exhibits explicit clustering
characteristics, ignoring the rest of the feature space (class-invariant
features). This paper exploits class-invariant features via graph contrastive
learning to discover additional high-quality features for unsupervised
clustering. We formulate a novel node-level fine-grained augmentation framework
for self-supervised learning, which iteratively identifies competitive
contrastive samples from the whole feature space, in the form of positive and
negative examples of node relations. While positive examples of node relations
are usually expressed as edges under graph homophily, negative examples are
implicit, lacking a direct edge. We show, however, that simply sampling nodes
beyond the local neighborhood results in less competitive negative pairs, which
are less effective for contrastive learning. Inspired by counterfactual
augmentation, we instead sample competitive negative node relations by creating
virtual nodes that inherit (in a self-supervised fashion) class-invariant
features, while altering class-dependent features, creating contrasting pairs
that lie closer to the boundary and offer better contrast. Consequently, our
experiments demonstrate significant improvements in unsupervised node clustering
tasks over six baselines on six real-world social network datasets.
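The abstract contains no implementation details, but the hard-negative idea it describes can be sketched as follows. The snippet below is a hypothetical illustration, not the authors' code: it assumes node embeddings from an arbitrary graph encoder, pseudo-cluster labels from any cheap clustering step, a variance-based heuristic for separating class-dependent from class-invariant dimensions (the paper's actual self-supervised split is not specified in the abstract), and a two-way InfoNCE-style loss pairing edge-based positives with virtual hard negatives. All function names are illustrative.

```python
# Hypothetical sketch of counterfactual-style hard negatives for graph contrastive
# learning; assumptions are noted in the lead-in text, and none of this is the
# authors' released code.
import torch
import torch.nn.functional as F


def split_feature_dims(z: torch.Tensor, pseudo_labels: torch.Tensor, top_frac: float = 0.3):
    """Heuristic split: dimensions whose cluster-wise means vary most are treated as
    class-dependent; the remaining dimensions are treated as class-invariant."""
    centroids = torch.stack([z[pseudo_labels == c].mean(0) for c in pseudo_labels.unique()])
    spread = centroids.var(dim=0)                       # per-dimension variance across clusters
    k = max(1, int(top_frac * z.size(1)))
    dep_dims = torch.topk(spread, k).indices            # "class-dependent" dimensions
    invariant_mask = torch.ones(z.size(1), dtype=torch.bool)
    invariant_mask[dep_dims] = False                    # True -> "class-invariant" dimensions
    return dep_dims, invariant_mask


def make_virtual_negatives(z: torch.Tensor, pseudo_labels: torch.Tensor, dep_dims: torch.Tensor):
    """For each node, build a virtual node that keeps its class-invariant coordinates
    but borrows class-dependent coordinates from a node with a different pseudo-label."""
    n = z.size(0)
    donors = torch.randint(0, n, (n,))
    for _ in range(5):                                  # best-effort resampling of same-cluster donors
        same = pseudo_labels[donors] == pseudo_labels
        if not same.any():
            break
        donors[same] = torch.randint(0, n, (int(same.sum()),))
    neg = z.clone()
    neg[:, dep_dims] = z[donors][:, dep_dims]
    return neg


def contrastive_hard_loss(z: torch.Tensor, edge_index: torch.Tensor, neg: torch.Tensor,
                          temperature: float = 0.5):
    """Two-way InfoNCE: each edge (src, dst) is a positive pair, and the virtual node
    built for src is its hard negative."""
    z = F.normalize(z, dim=1)
    neg = F.normalize(neg, dim=1)
    src, dst = edge_index
    pos_sim = (z[src] * z[dst]).sum(-1) / temperature
    neg_sim = (z[src] * neg[src]).sum(-1) / temperature
    logits = torch.stack([pos_sim, neg_sim], dim=1)     # column 0 is the positive
    labels = torch.zeros(logits.size(0), dtype=torch.long)
    return F.cross_entropy(logits, labels)


if __name__ == "__main__":
    # Toy usage with random placeholders standing in for a real encoder and graph.
    torch.manual_seed(0)
    z = torch.randn(100, 32)                            # embeddings from any graph encoder
    pseudo = torch.randint(0, 4, (100,))                # placeholder pseudo-cluster labels
    edge_index = torch.randint(0, 100, (2, 400))        # placeholder edge list
    dep_dims, _ = split_feature_dims(z, pseudo)
    neg = make_virtual_negatives(z, pseudo, dep_dims)
    print(contrastive_hard_loss(z, edge_index, neg))
```

The design intuition, as stated in the abstract, is that a virtual negative sharing a node's class-invariant coordinates lies near the cluster boundary and is therefore harder to separate than a uniformly sampled distant node, giving a stronger contrastive signal.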