COCLEP: Contrastive Learning-based Semi-Supervised Community Search
Authors: Ling Li, Siqiang Luo, Yuhai Zhao, Caihua Shan, Zhengkui Wang, Lu Qin
Venue: 2023 IEEE 39th International Conference on Data Engineering (ICDE)
Publication date: 2023-04-01
DOI: 10.1109/ICDE55515.2023.00191 (https://doi.org/10.1109/ICDE55515.2023.00191)
Citation count: 2
Abstract
Community search is a fundamental graph processing task that aims to find a community containing a given query node. Recent studies show that machine learning (ML)-based community search can return higher-quality communities than classic methods such as k-core and k-truss. However, state-of-the-art ML-based models require a large amount of labeled data (i.e., nodes in ground-truth communities) for training, which is difficult to obtain in real applications, and they incur unaffordable memory costs or query times on large datasets. To address these issues, we present COCLEP, a community search method based on contrastive learning with partition, which requires only a few labels and is both memory- and query-efficient. In particular, given a small collection of query nodes and a few (e.g., three) corresponding ground-truth community nodes for each query, COCLEP learns a query-dependent model through the proposed graph neural network and the designed label-aware contrastive learner. The former perceives query node information, low-order neighborhood information, and high-order hypergraph structure information; the latter contrasts low-order intra-view, high-order intra-view, and low-high-order inter-view representations of the nodes. Further, we theoretically prove that COCLEP can scale to large datasets with the min-cut over the graph. To the best of our knowledge, this is the first attempt to adopt contrastive learning for the community search task, which is nontrivial. Extensive experiments on real-world datasets show that COCLEP simultaneously achieves better community effectiveness and comparably high query efficiency while using fewer labels than state-of-the-art approaches, and that it scales to large datasets.
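
For readers unfamiliar with the task, the classic k-core baseline named in the abstract can be sketched as follows. This is not COCLEP and is not taken from the paper; it is a minimal illustration of what "find a community containing the query node" means for a structural method. The function name `k_core_community`, the adjacency format (a dict mapping each node to a set of neighbors in an undirected graph without self-loops), and the toy graph are all assumptions made for this example.

```python
from collections import deque


def k_core_community(adj, query, k):
    """Return the connected k-core containing `query`, or an empty set.

    `adj` is assumed to map each node to a set of its neighbors
    (undirected graph, no self-loops). The input is not modified.
    """
    alive = set(adj)
    degree = {v: len(adj[v]) for v in adj}

    # Peeling step: repeatedly remove nodes whose remaining degree is below k.
    queue = deque(v for v in alive if degree[v] < k)
    while queue:
        v = queue.popleft()
        if v not in alive:
            continue  # already removed via an earlier queue entry
        alive.discard(v)
        for u in adj[v]:
            if u in alive:
                degree[u] -= 1
                if degree[u] < k:
                    queue.append(u)

    if query not in alive:
        return set()

    # BFS restricted to surviving nodes: the query's connected component.
    community = {query}
    frontier = deque([query])
    while frontier:
        v = frontier.popleft()
        for u in adj[v]:
            if u in alive and u not in community:
                community.add(u)
                frontier.append(u)
    return community


# Toy graph (hypothetical): a 4-clique {0, 1, 2, 3} with a tail 3-4-5.
example_adj = {
    0: {1, 2, 3},
    1: {0, 2, 3},
    2: {0, 1, 3},
    3: {0, 1, 2, 4},
    4: {3, 5},
    5: {4},
}
print(sorted(k_core_community(example_adj, query=0, k=3)))  # [0, 1, 2, 3]
```

A structural method like this needs no labels, but its answer depends entirely on the choice of k and on the cohesiveness rule. That rigidity is the limitation that ML-based approaches such as the one described above aim to relax.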