IEEE Transactions on Knowledge and Data Engineering: Latest Articles

Imbalanced Node Classification With Synthetic Over-Sampling
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2024-08-14 DOI: 10.1109/TKDE.2024.3443160
Tianxiang Zhao;Xiang Zhang;Suhang Wang
Abstract: In recent years, graph neural networks (GNNs) have achieved state-of-the-art performance for node classification. However, most existing GNNs would suffer from the graph imbalance problem. In many real-world scenarios, node classes are imbalanced, with some majority classes making up most parts of the graph. The message propagation mechanism in GNNs would further amplify the dominance of those majority classes, resulting in sub-optimal classification performance. In this work, we seek to address this problem by generating pseudo instances of minority classes to balance the training data, extending previous over-sampling-based techniques. This task is non-trivial, as those techniques are designed with the assumption that instances are independent. Neglection of relation information would complicate this oversampling process. Furthermore, the node classification task typically takes the semi-supervised setting with only a few labeled nodes, providing insufficient supervision for the generation of minority instances. Generated new nodes of low quality would harm the trained classifier. In this work, we address these difficulties by synthesizing new nodes in a constructed embedding space, which encodes both node attributes and topology information. Furthermore, an edge generator is trained simultaneously to model the graph structure and provide relations for new samples. To further improve the data efficiency, we also explore synthesizing mixed "in-between" nodes to utilize nodes from the majority class in this over-sampling process. Experiments on real-world datasets validate the effectiveness of our proposed framework.
Vol. 36, No. 12, pp. 8515-8528
Citations: 0
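The over-sampling step described in the abstract above (creating pseudo minority instances in an embedding space) can be illustrated with a generic SMOTE-style interpolation. This is only a minimal sketch, not the authors' framework: the function names, the toy 2-D embeddings, and the nearest-neighbour choice are assumptions made for illustration, and the edge generator and the mixed "in-between" nodes mentioned in the abstract are not modelled.

```python
import math
import random


def nearest_same_class(idx, embeddings, labels):
    """Return the index of the closest other node sharing idx's label."""
    best, best_dist = None, math.inf
    for j, (vec, lab) in enumerate(zip(embeddings, labels)):
        if j == idx or lab != labels[idx]:
            continue
        dist = math.dist(embeddings[idx], vec)
        if dist < best_dist:
            best, best_dist = j, dist
    return best


def synthesize_minority(embeddings, labels, minority, n_new, rng):
    """Create n_new synthetic embeddings by interpolating minority pairs.

    Assumes the minority class has at least two labeled nodes.
    """
    minority_ids = [i for i, lab in enumerate(labels) if lab == minority]
    synthetic = []
    for _ in range(n_new):
        i = rng.choice(minority_ids)
        j = nearest_same_class(i, embeddings, labels)
        delta = rng.random()  # interpolation weight in [0, 1)
        synthetic.append(
            [a + delta * (b - a) for a, b in zip(embeddings[i], embeddings[j])]
        )
    return synthetic


# Toy data (hypothetical): class 1 is the minority with two nodes.
embeddings = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [5.0, 6.0], [6.0, 5.0]]
labels = [1, 1, 0, 0, 0]
new_nodes = synthesize_minority(
    embeddings, labels, minority=1, n_new=3, rng=random.Random(0)
)
```

Each synthetic point lies on the segment between two minority embeddings, so it stays inside the region the minority class already occupies. Connecting such points to the graph would need a separate step, which this sketch leaves out.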
An Balanced, and Scalable Graph-Based Multiview Clustering Method
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2024-08-14 DOI: 10.1109/TKDE.2024.3443534
Zihua Zhao;Feiping Nie;Rong Wang;Zheng Wang;Xuelong Li
Abstract: In recent years, graph-based multiview clustering methods have become a research hotspot in the clustering field. However, most existing methods lack consideration of cluster balance in their results. In fact, cluster balance is crucial in many real-world scenarios. Additionally, graph-based multiview clustering methods often suffer from high time consumption and cannot handle large-scale datasets. To address these issues, this paper proposes a novel graph-based multiview clustering method. The method is built upon the bipartite graph. Specifically, it employs a label propagation mechanism to update the smaller anchor label matrix rather than the sample label matrix, significantly reducing the computational cost. The introduced balance constraint in the proposed model contributes to achieving balanced clustering results. The entire clustering model combines information from multiple views through graph fusion. The joint graph and view weight parameters in the model are obtained through task-driven self-supervised learning. Moreover, the model can directly obtain clustering results without the need for the two-stage processing typically used in general spectral clustering. Finally, extensive experiments on toy datasets and real-world datasets are conducted to validate the superiority of the proposed method in terms of clustering performance, clustering balance, and time expenditure.
Vol. 36, No. 12, pp. 7643-7656
Citations: 0
From Wide to Deep: Dimension Lifting Network for Parameter-Efficient Knowledge Graph Embedding
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2024-08-14 DOI: 10.1109/TKDE.2024.3437479
Borui Cai;Yong Xiang;Longxiang Gao;Di Wu;He Zhang;Jiong Jin;Tom Luan
Abstract: Knowledge graph embedding (KGE) that maps entities and relations into vector representations is essential for downstream applications. Conventional KGE methods require high-dimensional representations to learn the complex structure of knowledge graph, but lead to oversized model parameters. Recent advances reduce parameters by low-dimensional entity representations, while developing techniques (e.g., knowledge distillation or reinvented representation forms) to compensate for reduced dimension. However, such operations introduce complicated computations and model designs that may not benefit large knowledge graphs. To seek a simple strategy to improve the parameter efficiency of conventional KGE models, we take inspiration from that deeper neural networks require exponentially fewer parameters to achieve expressiveness comparable to wider networks for compositional structures. We view all entity representations as a single-layer embedding network, and conventional KGE methods that adopt high-dimensional entity representations equal widening the embedding network to gain expressiveness. To achieve parameter efficiency, we instead propose a deeper embedding network for entity representations, i.e., a narrow entity embedding layer plus a multi-layer dimension lifting network (LiftNet). Experiments on three public datasets show that by integrating LiftNet, four conventional KGE methods with 16-dimensional representations achieve comparable link prediction accuracy as original models that adopt 512-dimensional representations, saving 68.4% to 96.9% parameters.
Vol. 36, No. 12, pp. 8341-8348
Citations: 0
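The parameter-efficiency argument in the abstract above (a narrow entity table plus a small lifting network versus one wide table) comes down to simple counting, sketched below. The abstract does not state what layers LiftNet uses, so this sketch assumes fully connected lifting layers purely for illustration; the entity count of 10,000 and the hidden width of 128 are likewise hypothetical, and the result is not meant to reproduce the paper's reported savings.

```python
def wide_embedding_params(num_entities, dim):
    """Parameters of a single wide entity embedding table."""
    return num_entities * dim


def lifted_embedding_params(num_entities, narrow_dim, layer_dims):
    """Parameters of a narrow table plus dense lifting layers.

    Each assumed dense layer contributes a weight matrix and a bias vector;
    the last entry of layer_dims is the final (lifted) dimension.
    """
    total = num_entities * narrow_dim
    prev = narrow_dim
    for out_dim in layer_dims:
        total += prev * out_dim + out_dim
        prev = out_dim
    return total


# Hypothetical setting: 10,000 entities, 16-dim table lifted to 512 dims.
wide = wide_embedding_params(10_000, 512)
lifted = lifted_embedding_params(10_000, 16, [128, 512])
saving = 1 - lifted / wide  # fraction of entity-side parameters saved
```

The lifting layers are shared by all entities, so their cost does not grow with the number of entities, whereas the wide table grows linearly with both entity count and dimension.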
RollStore: Hybrid Onchain-Offchain Data Indexing for Blockchain Applications
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2024-08-12 DOI: 10.1109/TKDE.2024.3436514
Qi Lin;Binbin Gu;Faisal Nawab
Abstract: The interest in building blockchain Decentralized Applications (DApps) has been growing over the past few years. DApps are implemented as smart contracts which are programs that are maintained by a blockchain network. Building DApps, however, faces many challenges, most notably the performance and monetary overhead of writing to blockchain smart contracts. To overcome this challenge, many DApp developers have explored utilizing off-chain resources (nodes outside of the blockchain network) to offload part of the processing and storage. In this paper, we propose RollStore, a data indexing solution for hybrid onchain-offchain DApps. RollStore provides efficiency in terms of reduced cost and latency, as well as security in terms of tolerating Byzantine (i.e., malicious) off-chain nodes. RollStore achieves this by: (1) a three-stage commitment strategy where each stage represents a point in a performance-security trade-off, i.e., the first stage is fast but less secure while the last stage is slower but more secure; (2) utilizing zero-knowledge (zk) proofs to enable the on-chain smart contract to verify off-chain operations with a small cost; (3) combining Log-Structured Merge (LSM) trees and Merkle Mountain Range (MMR) trees to efficiently enable both access and verification of indexed data. We experimentally evaluate the cost and performance benefits of RollStore while comparing with BlockchainDB and BigChainDB.
Vol. 36, No. 12, pp. 9176-9191
Citations: 0
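The abstract above names Merkle Mountain Range (MMR) trees as one of the structures used for verifiable indexing. The following is a minimal, generic sketch of an append-only MMR, not RollStore's implementation: the class name, the domain-separation prefixes, and the right-to-left peak folding are assumptions chosen for illustration, and the LSM tree, zk proofs, and commitment stages are not modelled.

```python
import hashlib


def h(data: bytes) -> bytes:
    """SHA-256 digest of data."""
    return hashlib.sha256(data).digest()


class MerkleMountainRange:
    """Append-only accumulator kept as a list of perfect-tree peaks."""

    def __init__(self):
        self.peaks = []  # list of (height, digest), ordered left to right

    def append(self, leaf: bytes) -> None:
        node = (0, h(b"\x00" + leaf))  # prefix 0x00 marks a leaf hash
        # Merge while the rightmost peak has the same height as the new node.
        while self.peaks and self.peaks[-1][0] == node[0]:
            height, left = self.peaks.pop()
            node = (height + 1, h(b"\x01" + left + node[1]))
        self.peaks.append(node)

    def root(self) -> bytes:
        """Fold the peaks right to left into a single commitment."""
        if not self.peaks:
            return h(b"")
        acc = self.peaks[-1][1]
        for _, digest in reversed(self.peaks[:-1]):
            acc = h(b"\x01" + digest + acc)
        return acc


mmr = MerkleMountainRange()
for i in range(7):
    mmr.append(f"record-{i}".encode())
heights = [height for height, _ in mmr.peaks]  # 7 leaves = 4 + 2 + 1
```

Because an append only touches the rightmost peaks, the structure suits append-heavy logs: the number of peaks is at most logarithmic in the number of leaves, and a single digest commits to the whole history.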
Prompt Tuning on Graph-Augmented Low-Resource Text Classification
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2024-08-12 DOI: 10.1109/TKDE.2024.3440068
Zhihao Wen;Yuan Fang
Abstract: Text classification is a fundamental problem in information retrieval with many real-world applications, such as predicting the topics of online articles and the categories of e-commerce product descriptions. However, low-resource text classification, with no or few labeled samples, presents a serious concern for supervised learning. Meanwhile, many text data are inherently grounded on a network structure, such as a hyperlink/citation network for online articles, and a user-item purchase network for e-commerce products. These graph structures capture rich semantic relationships, which can potentially augment low-resource text classification. In this paper, we propose a novel model called Graph-Grounded Pre-training and Prompting (G2P2) to address low-resource text classification in a two-pronged approach. During pre-training, we propose three graph interaction-based contrastive strategies to jointly pre-train a graph-text model; during downstream classification, we explore handcrafted discrete prompts and continuous prompt tuning for the jointly pre-trained model to achieve zero- and few-shot classification, respectively. Moreover, we explore the possibility of employing continuous prompt tuning for zero-shot inference. Specifically, we aim to generalize continuous prompts to unseen classes while leveraging a set of base classes. To this end, we extend G2P2 into G2P2*, hinging on a new architecture of conditional prompt tuning. Extensive experiments on four real-world datasets demonstrate the strength of G2P2 in zero- and few-shot low-resource text classification tasks, and illustrate the advantage of G2P2* in dealing with unseen classes.
Vol. 36, No. 12, pp. 9080-9095
Citations: 0
Make Heterophilic Graphs Better Fit GNN: A Graph Rewiring Approach
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2024-08-12 DOI: 10.1109/TKDE.2024.3441766
Wendong Bi;Lun Du;Qiang Fu;Yanlin Wang;Shi Han;Dongmei Zhang
Abstract: Graph Neural Networks (GNNs) have shown superior performance in modeling graph data. Existing studies have shown that a lot of GNNs perform well on homophilic graphs while performing poorly on heterophilic graphs. Recently, researchers have turned their attention to design GNNs for heterophilic graphs by specific model design. Different from existing methods that mitigate heterophily by model design, we propose to study heterophilic graphs from an orthogonal perspective by rewiring the graph to reduce heterophily and make GNNs perform better. Through comprehensive empirical analysis, we verify the potential of graph rewiring methods. Then we propose a method named Deep Heterophily Graph Rewiring (DHGR) to rewire graphs by adding homophilic edges and pruning heterophilic edges. The rewiring operation is implemented by comparing the similarity of neighborhood label/feature distribution of node pairs. Besides, we design a scalable implementation for DHGR to guarantee a high efficiency. DHGR can be easily used as a plug-in module, i.e., a graph pre-processing step, for any GNNs, including both GNNs for homophily and heterophily, to boost their performance on the node classification task. To the best of our knowledge, it is the first work studying graph rewiring for heterophilic graphs. Extensive experiments on 11 public graph datasets demonstrate the superiority of our proposed methods.
Vol. 36, No. 12, pp. 8744-8757
Citations: 0
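The rewiring rule described in the abstract above (add edges between node pairs whose neighborhood label distributions are similar, prune edges between dissimilar pairs) can be sketched as follows. This is a simplified illustration rather than DHGR itself: it assumes every node's label is known, compares all pairs exhaustively instead of using the scalable implementation the abstract mentions, and the function names and both thresholds are hypothetical.

```python
import math
from itertools import combinations


def neighbor_label_distribution(node, adj, labels, num_classes):
    """Normalized histogram of the labels of node's neighbors."""
    counts = [0.0] * num_classes
    for nb in adj[node]:
        counts[labels[nb]] += 1.0
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def rewire(adj, labels, num_classes, add_threshold, prune_threshold):
    """Return a new adjacency dict with similar pairs linked, dissimilar cut."""
    dists = {
        n: neighbor_label_distribution(n, adj, labels, num_classes) for n in adj
    }
    new_adj = {n: set(nbs) for n, nbs in adj.items()}
    for u, v in combinations(sorted(adj), 2):
        sim = cosine(dists[u], dists[v])
        if sim >= add_threshold:
            new_adj[u].add(v)
            new_adj[v].add(u)
        elif sim < prune_threshold:
            new_adj[u].discard(v)
            new_adj[v].discard(u)
    return new_adj


# Toy fully heterophilic graph: every edge joins a class-0 and a class-1 node.
labels = [0, 0, 1, 1]
adj = {0: {2, 3}, 1: {2, 3}, 2: {0, 1}, 3: {0, 1}}
rewired = rewire(adj, labels, num_classes=2, add_threshold=0.9, prune_threshold=0.5)
```

In this toy case nodes 0 and 1 see identical neighbor label distributions, so they get connected, while the cross-class edges are pruned. The rewired graph can then be fed to any GNN as a pre-processing result.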
Optimization Techniques for Unsupervised Complex Table Reasoning via Self-Training Framework
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2024-08-12 DOI: 10.1109/TKDE.2024.3439405
Zhenyu Li;Xiuxing Li;Sunqi Fan;Jianyong Wang
Abstract: Structured tabular data is a fundamental data type in numerous fields, and the capacity to reason over tables is crucial for answering questions and validating hypotheses. However, constructing labeled data for complex reasoning tasks is labor-intensive, and the quantity of annotated data remains insufficient to support the intricate demands of real-world applications. To address the insufficient annotation challenge, we present a self-training framework for unsupervised complex tabular reasoning (UCTR-ST) by generating diverse synthetic data with complex logic. Specifically, UCTR-ST incorporates several essential techniques: we aggregate diverse programs and execute them on tables based on a "Program-Management" component, and we bridge the gap between programs and text with a powerful "Program-Transformation" module that generates natural language sentences with complex logic. Furthermore, we optimize the procedure using "Table-Text Manipulator" to handle joint table-text reasoning scenarios. The entire framework utilizes self-training techniques to leverage the unlabeled training data, which results in significant performance improvements when tested on real-world data. Experimental results demonstrate that UCTR-ST achieves above 90% of the supervised model performance on different tasks and domains, reducing the dependence on manual annotation. Additionally, our approach can serve as a data augmentation technique, significantly boosting the performance of supervised models in low-resourced domains.
Vol. 36, No. 12, pp. 8996-9010
Citations: 0
Global Optimal Travel Planning for Massive Travel Queries in Road Networks
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2024-08-09 DOI: 10.1109/TKDE.2024.3439409
Yehong Xu;Lei Li;Mengxuan Zhang;Zizhuo Xu;Xiaofang Zhou
Abstract: Travel planning plays an increasingly important role in our society. The travel plans, which consist of the paths each vehicle is suggested to follow and its corresponding departure time, influence the traffic conditions naturally. However, existing travel planning algorithms cannot consider the planning results and their influences simultaneously, so traffic congestion could be created when many vehicles are directed to adopt similar travel plans. In this paper, we propose the Global Optimal Travel Planning (GOTP) problem that aims to minimize traffic congestion by continuously evaluating traffic conditions for a set of planning tasks. Achieving this global optimization goal is non-trivial because travel planning and traffic evaluation are time-consuming and interdependent. To break this dependency, we first propose a GOTP paradigm that interleaves travel planning and traffic evaluation for queries, where the planning consists of departure time planning and travel path planning. To implement the paradigm, we propose the serial model that optimizes travel plans one by one, followed by the batch model that improves processing efficiency, and the iterative model that further optimizes planning quality. Extensive experiments on large real-world networks with synthetic and real workloads validate the effectiveness and efficiency of our methods.
Vol. 36, No. 12, pp. 8377-8394
Citations: 0
A Dynamic Analysis-Powered Explanation Framework for Malware Detection
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2024-08-09 DOI: 10.1109/TKDE.2024.3436891
Huijuan Zhu;Xilong Chen;Liangmin Wang;Zhicheng Xu;Victor S. Sheng
Abstract: Deep learning has been widely adopted in Android malicious software (malware) detection. However, poor explanation in deep learning-based detection models severely undermines user trust and poses a significant obstacle to their practical promotion in critical security domains. Some studies strive to uncover the rationale behind a model's decision. Unfortunately, these efforts are often hindered by the limitations of feature extraction methods, such as primarily relying on static analysis to derive separate and approximate behavioral descriptions of applications (apps). As a result, establishing a reliable interpretation for deep learning-based malware detection models remains an open issue. In this work, we propose a novel framework XDeepMal to interpret deep learning-based malware detection models. Specifically, in XDeepMal, we formulate a dynamic analysis tool XTracer+ to capture runtime behaviors of apps and automatically generate their continuous behavior trajectories. Then, we propose a novel interpreter to pinpoint certainty behavior fragments that are crucial for deep learning models to make their decisions. This approach regards the identification of the most critical fragments as an optimization problem and leverages heuristic algorithms for implementation. We conduct extensive experiments on a real-world dataset to investigate the effectiveness and reliability of XDeepMal. These experiments cover intuitive case studies (malware family and individual app) and in-depth quantitative analysis. Additionally, we evaluate its coverage and efficiency. Our experimental results demonstrate that XDeepMal is capable of generating convincing interpretations for deep learning (e.g., Transformer) based models within feasible inference time, which greatly benefits security analysts in accurately comprehending why an app is identified as malware by deep learning-based detection models.
Vol. 36, No. 12, pp. 7483-7496
Citations: 0
Hyperedge Graph Contrastive Learning
IF 8.9, Zone 2, Computer Science
IEEE Transactions on Knowledge and Data Engineering Pub Date : 2024-08-09 DOI: 10.1109/TKDE.2024.3435861
Junfeng Zhang;Weixin Zeng;Jiuyang Tang;Xiang Zhao
Abstract: Although various graph contrastive learning (GCL) techniques have been employed to generate augmented views and maximize their mutual information, current solutions only consider the pairwise relationships based on edges, neglecting the high-order information that can help generate more informative augmented views and make better contrast. To fill in this gap, we propose to leverage hyperedge to facilitate GCL, as it connects two or more nodes and can model high-order relationships among multiple nodes. More specifically, hyperedges are constructed based on the original graph. Then, we conduct node-level PageRank based on hyperedges and hyperedge-level PageRank based on nodes to generate augmented views. As to the contrasting stage, different from existing GCL methods that simply treat the corresponding nodes of the anchor in different views as positives and overlook certain nodes strongly associated with the anchor, we build the positives and negatives based on hyperedges, where whether a node is a positive is determined by the number of hyperedges it coexists with the anchor. We compare our hyperedge GCL with state-of-the-art methods on downstream tasks, and the empirical results validate the superiority of our proposal. Further experiments on graph augmentation and graph contrastive loss also demonstrate the effectiveness of the proposed modules.
Vol. 36, No. 12, pp. 8502-8514
Citations: 0
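The positive/negative selection rule in the abstract above (a node counts as a positive depending on how many hyperedges it shares with the anchor) can be sketched with plain counting. This is an illustrative reading only: the abstract does not give the exact decision rule, so the fixed cut-off `min_shared`, the function names, and the toy hyperedges are all assumptions, and the PageRank-based view augmentation is not modelled.

```python
from collections import Counter


def shared_hyperedge_counts(anchor, hyperedges):
    """Count, for every other node, the hyperedges it shares with anchor."""
    counts = Counter()
    for edge in hyperedges:
        if anchor in edge:
            for node in edge:
                if node != anchor:
                    counts[node] += 1
    return counts


def split_positives(anchor, nodes, hyperedges, min_shared):
    """Split nodes into positives (>= min_shared shared hyperedges) and negatives."""
    counts = shared_hyperedge_counts(anchor, hyperedges)
    positives = {n for n in nodes if n != anchor and counts[n] >= min_shared}
    negatives = {n for n in nodes if n != anchor and n not in positives}
    return positives, negatives


# Toy hypergraph (hypothetical): node 1 shares two hyperedges with node 0.
hyperedges = [{0, 1, 2}, {0, 1, 3}, {2, 3, 4}]
positives, negatives = split_positives(
    anchor=0, nodes=range(5), hyperedges=hyperedges, min_shared=2
)
```

Raising `min_shared` makes the positive set stricter, so only nodes that co-occur with the anchor in several hyperedges are pulled toward it in the contrastive objective.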