Unsupervised node clustering via contrastive hard sampling

arXiv - CS - Social and Information Networks, Pub Date: 2024-09-12, DOI: arxiv-2409.07718

Hang Cui, Tarek Abdelzaher


Citations: 0

Abstract

This paper introduces a fine-grained contrastive learning scheme for unsupervised node clustering. Previous clustering methods focus only on a small set of features (class-dependent features) that exhibit explicit clustering characteristics, and ignore the rest of the feature space (class-invariant features). This paper exploits class-invariant features via graph contrastive learning to discover additional high-quality features for unsupervised clustering. We formulate a novel node-level, fine-grained augmentation framework for self-supervised learning that iteratively identifies competitive contrastive samples from the whole feature space, in the form of positive and negative examples of node relations. Positive examples of node relations are usually expressed as edges under graph homophily, whereas negative examples are implicit, with no direct edge. We show, however, that simply sampling nodes beyond the local neighborhood yields less competitive negative pairs, which are less effective for contrastive learning. Inspired by counterfactual augmentation, we instead sample competitive negative node relations by creating virtual nodes that inherit class-invariant features (in a self-supervised fashion) while altering class-dependent features, producing contrasting pairs that lie closer to the boundary and offer better contrast. Our experiments demonstrate significant improvements in unsupervised node clustering tasks over six baselines on six real-world social network datasets.
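The idea of a "competitive" negative built from a virtual node can be illustrated with a small toy sketch. This is not the authors' implementation: the abstract does not say how class-dependent features are identified or how virtual nodes are constructed, so the boolean mask, the feature swap with a distant donor node, the hypothetical helper names, and the use of an InfoNCE-style loss are all assumptions made purely for illustration.

```python
import numpy as np


def make_virtual_negative(x_anchor, x_donor, class_dependent_mask):
    """Hypothetical helper: keep the anchor's class-invariant dimensions and
    take the class-dependent dimensions from a donor node."""
    return np.where(class_dependent_mask, x_donor, x_anchor)


def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def info_nce(anchor, positive, negatives, temperature=0.5):
    """InfoNCE-style loss for one anchor (an assumed stand-in loss)."""
    pos = np.exp(cosine(anchor, positive) / temperature)
    neg = sum(np.exp(cosine(anchor, n) / temperature) for n in negatives)
    return float(-np.log(pos / (pos + neg)))


# Assumption: the first two dimensions are class-dependent, the rest class-invariant.
mask = np.array([True, True, False, False, False, False])

anchor = np.array([1.0, 0.0, 0.5, 0.5, 0.5, 0.5])     # node of interest
neighbor = np.array([1.0, 0.0, 0.4, 0.6, 0.5, 0.5])   # positive: linked node
distant = np.array([0.0, 1.0, -0.5, 0.2, -0.3, 0.1])  # node beyond the neighborhood

# Virtual node: the anchor's class-invariant features with altered class-dependent ones.
virtual = make_virtual_negative(anchor, distant, mask)

loss_easy = info_nce(anchor, neighbor, [distant])  # plain distant negative
loss_hard = info_nce(anchor, neighbor, [virtual])  # virtual hard negative

print("similarity to distant negative:", cosine(anchor, distant))
print("similarity to virtual negative:", cosine(anchor, virtual))
print("loss with distant negative:", loss_easy)
print("loss with virtual negative:", loss_hard)
```

In this toy setup the virtual node is more similar to the anchor than the distant node is, so it contributes a larger loss term. That is one way to read the abstract's claim that such pairs lie closer to the boundary and offer better contrast.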

