COCLEP: Contrastive Learning-based Semi-Supervised Community Search

Ling Li, Siqiang Luo, Yuhai Zhao, Caihua Shan, Zhengkui Wang, Lu Qin
Published in: 2023 IEEE 39th International Conference on Data Engineering (ICDE), 2023-04-01
DOI: 10.1109/ICDE55515.2023.00191 (https://doi.org/10.1109/ICDE55515.2023.00191)
Citations: 2

Abstract

Community search is a fundamental graph processing task that aims to find a community containing a given query node. Recent studies show that machine learning (ML)-based community search can return higher-quality communities than classic methods such as k-core and k-truss. However, state-of-the-art ML-based models require a large amount of labeled data (i.e., nodes in ground-truth communities) for training, which is difficult to obtain in real applications, and they incur unaffordable memory costs or query times on large datasets. To address these issues, we present COCLEP, a community search method based on contrastive learning with partition, which requires only a few labels and is both memory- and query-efficient. In particular, given a small collection of query nodes and a few (e.g., three) corresponding ground-truth community nodes for each query, COCLEP learns a query-dependent model through the proposed graph neural network and the designed label-aware contrastive learner. The former perceives query node information, low-order neighborhood information, and high-order hypergraph structure information; the latter contrasts low-order intra-view, high-order intra-view, and low-high-order inter-view representations of the nodes. Further, we theoretically prove that COCLEP can scale to large datasets by means of a min-cut over the graph. To the best of our knowledge, this is the first attempt to adopt contrastive learning for the community search task, which is nontrivial. Extensive experiments on real-world datasets show that, compared with state-of-the-art approaches, COCLEP simultaneously achieves better community effectiveness and comparably high query efficiency while using fewer labels, and that it is scalable to large datasets.
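
For readers unfamiliar with the classic baselines the abstract mentions, the following is a minimal sketch of k-core community search: peel away nodes with fewer than k remaining neighbors, then return the connected component of what is left that contains the query node. It illustrates the baseline only, not COCLEP itself; the function name `k_core_community`, the adjacency-dict graph representation, and the toy graph are assumptions made for illustration.

```python
from collections import deque


def k_core_community(adj, query, k):
    """Return the connected k-core component containing `query`, or an empty set.

    `adj` maps each node to the set of its neighbors (undirected graph).
    This is an illustrative baseline, not the method proposed in the paper.
    """
    # Track remaining degrees separately so the input graph is left untouched.
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    removed = set()
    # Peel: repeatedly delete nodes whose remaining degree is below k.
    queue = deque(v for v, d in degree.items() if d < k)
    while queue:
        v = queue.popleft()
        if v in removed:
            continue
        removed.add(v)
        for u in adj[v]:
            if u not in removed:
                degree[u] -= 1
                if degree[u] < k:
                    queue.append(u)
    if query not in adj or query in removed:
        return set()
    # Breadth-first search over surviving nodes to collect the query's component.
    community = {query}
    frontier = deque([query])
    while frontier:
        v = frontier.popleft()
        for u in adj[v]:
            if u not in removed and u not in community:
                community.add(u)
                frontier.append(u)
    return community


# Toy graph (hypothetical): a 4-clique {0, 1, 2, 3} with a path 3 - 4 - 5 attached.
toy_graph = {
    0: {1, 2, 3},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {0, 1, 2, 4},
    4: {3, 5},
    5: {4},
}
community = k_core_community(toy_graph, query=0, k=3)
print(sorted(community))
```

On this toy graph, querying node 0 with k = 3 keeps only the clique, while querying node 5 with the same k returns an empty set because node 5 is peeled away. A purely structural rule of this kind uses no labels, which is the contrast the abstract draws with ML-based approaches.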