GRAIL: Graph contrastive learning with balanced negative sampling

IF 7.4 | Zone 1, Management | JCR Q1 | COMPUTER SCIENCE, INFORMATION SYSTEMS
Chengcheng Xu, Tianfeng Wang, Man Chen, Jun Chen, Wei Li, Zhisong Pan
DOI: 10.1016/j.ipm.2025.104211
Journal: Information Processing & Management, vol. 62, no. 5, Article 104211
Published: 2025-05-27
URL: https://www.sciencedirect.com/science/article/pii/S0306457325001529
Citations: 0

Abstract

Some graph contrastive learning methods mitigate class imbalance by balancing the number of anchors, but they overlook the crucial role that negative samples play in forming a regular simplex. Moreover, existing strategies select only a small number of low-quality positive samples, which causes the model to erroneously push away nodes with similar semantics. To address these issues, we propose GRAIL, a graph contrastive learning method with balanced negative sampling. GRAIL introduces a multi-head similarity metric that leverages mixed probability distributions related to dimensional elements to adaptively select an equal number of hard negative samples within each non-anchor cluster. As a result, GRAIL promotes the formation of a regular simplex by balancing the gradient contributions of different negative classes, and it selects the most informative hard negative samples to improve the distinguishing ability of minority classes while minimizing the impact on majority classes. Furthermore, GRAIL uses structural similarity and feature similarity to select multiple positive samples with a high correct ratio, enabling the model to learn trustworthy node representations. Because the traditional contrastive loss focuses on the majority class and neglects the minority class, a balanced contrastive loss is introduced to optimize node representations. Experiments on node classification, node clustering, and link prediction across six imbalanced graph datasets demonstrate that GRAIL outperforms existing state-of-the-art methods. The source code is available at https://github.com/xushucheng-coder/GRAIL/tree/master.
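
The idea of drawing an equal number of hard negatives from each non-anchor cluster can be illustrated with a small sketch. This is not the authors' implementation: plain cosine similarity stands in for the multi-head similarity metric (which the abstract does not specify), and the function names, toy embeddings, and cluster labels below are all hypothetical.

```python
import math
from collections import defaultdict


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def balanced_hard_negatives(embeddings, clusters, anchor, k):
    """Pick up to k hardest negatives from every cluster except the anchor's.

    ASSUMPTION: cosine similarity replaces the paper's multi-head metric,
    and `clusters` holds hypothetical pseudo-labels, one per node.
    """
    by_cluster = defaultdict(list)
    for node, cluster in enumerate(clusters):
        if cluster != clusters[anchor]:
            sim = cosine(embeddings[anchor], embeddings[node])
            by_cluster[cluster].append((sim, node))

    selected = {}
    for cluster, scored in by_cluster.items():
        # "Hard" negatives are the ones most similar to the anchor;
        # ties are broken by node index for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        selected[cluster] = [node for _, node in scored[:k]]
    return selected


# Toy data: cluster 1 has three nodes, cluster 2 has two.
embeddings = [
    [1.0, 0.0],   # node 0 (anchor), cluster 0
    [0.9, 0.1],   # node 1, cluster 0
    [0.8, 0.6],   # node 2, cluster 1
    [0.0, 1.0],   # node 3, cluster 1
    [0.6, 0.8],   # node 4, cluster 1
    [0.7, -0.7],  # node 5, cluster 2
    [-1.0, 0.0],  # node 6, cluster 2
]
clusters = [0, 0, 1, 1, 1, 2, 2]

result = balanced_hard_negatives(embeddings, clusters, anchor=0, k=1)
print(result)  # {1: [2], 2: [5]}
```

Every non-anchor cluster contributes the same number of negatives regardless of its size, so a large cluster cannot dominate the negative set the way it would under uniform sampling over all nodes.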
Source journal
Information Processing & Management (Engineering & Technology - Computer Science: Information Systems)
CiteScore: 17.00
Self-citation rate: 11.60%
Articles published: 276
Review time: 39 days
Journal description: Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology marketing, and social computing. We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.