Distill & Contrast: A New Graph Self-Supervised Method With Approximating Nature Data Relationships

Impact Factor: 8.9 · CAS Tier 2 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence)
Dongxiao He;Jitao Zhao;Rui Guo;Zhiyong Feng;Cuiying Huo;Di Jin;Witold Pedrycz;Weixiong Zhang
{"title":"Distill & Contrast: A New Graph Self-Supervised Method With Approximating Nature Data Relationships","authors":"Dongxiao He;Jitao Zhao;Rui Guo;Zhiyong Feng;Cuiying Huo;Di Jin;Witold Pedrycz;Weixiong Zhang","doi":"10.1109/TKDE.2025.3554524","DOIUrl":null,"url":null,"abstract":"Contrastive Learning (CL) has emerged as a popular self-supervised representation learning paradigm that has been shown in many applications to perform similarly to traditional supervised learning methods. A key component of CL is mining the latent discriminative relationships between positive and negative samples and using them as self-supervised labels. We argue that this discriminative contrastive task is, in essence, similar to a classification task, and the “either positive or negative” hard label sampling strategies are arbitrary. To solve this problem, we explore ideas from data distillation, which considers probabilistic logit vectors as soft labels to transfer model knowledge. We attempt to abandon the classical hard sampling labels in CL and instead explore self-supervised soft labels. We adopt soft sampling labels that are extracted, without supervision, from the inherent relationships in data pairs to retain more information. We propose a new self-supervised graph learning method, Distill and Contrast (D&C), for learning representations that closely approximate natural data relationships. D&C extracts node similarities from the features and structures to derive soft sampling labels, which also eliminate noise in the data to increase robustness. Extensive experimental results on real-world datasets demonstrate the effectiveness of the proposed method.","PeriodicalId":13496,"journal":{"name":"IEEE Transactions on Knowledge and Data Engineering","volume":"37 6","pages":"3284-3297"},"PeriodicalIF":8.9000,"publicationDate":"2025-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Knowledge and Data Engineering","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10938656/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Contrastive Learning (CL) has emerged as a popular self-supervised representation learning paradigm that has been shown in many applications to perform similarly to traditional supervised learning methods. A key component of CL is mining the latent discriminative relationships between positive and negative samples and using them as self-supervised labels. We argue that this discriminative contrastive task is, in essence, similar to a classification task, and the “either positive or negative” hard label sampling strategies are arbitrary. To solve this problem, we explore ideas from data distillation, which considers probabilistic logit vectors as soft labels to transfer model knowledge. We attempt to abandon the classical hard sampling labels in CL and instead explore self-supervised soft labels. We adopt soft sampling labels that are extracted, without supervision, from the inherent relationships in data pairs to retain more information. We propose a new self-supervised graph learning method, Distill and Contrast (D&C), for learning representations that closely approximate natural data relationships. D&C extracts node similarities from the features and structures to derive soft sampling labels, which also eliminate noise in the data to increase robustness. Extensive experimental results on real-world datasets demonstrate the effectiveness of the proposed method.
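
The abstract describes the method only at a high level. As a concrete illustration, the sketch below shows one plausible reading of the core idea: derive soft sampling labels, without supervision, from feature and structural similarities between node pairs, then train with a soft cross-entropy in place of the hard positive/negative labels of standard contrastive losses. This is a minimal PyTorch sketch under stated assumptions, not the authors' D&C implementation; the blending weight alpha, the temperature tau, and the specific similarity measures (cosine similarity of feature rows and of adjacency rows) are illustrative choices, not values from the paper.

    import torch
    import torch.nn.functional as F

    def soft_targets(x, adj, alpha=0.5, tau=0.5):
        # Soft sampling labels from raw features and one-hop structure.
        # alpha and tau are assumed hyperparameters, not from the paper.
        feat = F.normalize(x, dim=1)
        feat_sim = feat @ feat.T                  # feature cosine similarity
        nbr = F.normalize(adj, dim=1)
        struct_sim = nbr @ nbr.T                  # neighbourhood-overlap similarity
        sim = alpha * feat_sim + (1.0 - alpha) * struct_sim
        sim.fill_diagonal_(float('-inf'))         # a node is not its own pair
        return F.softmax(sim / tau, dim=1)        # each row is a soft label distribution

    def soft_contrastive_loss(z, targets, tau=0.5):
        # Soft cross-entropy between the embedding-similarity distribution
        # and the soft sampling labels (replacing hard InfoNCE labels).
        z = F.normalize(z, dim=1)
        logits = z @ z.T / tau
        logits.fill_diagonal_(-1e9)               # large negative avoids 0 * (-inf) = nan
        log_p = F.log_softmax(logits, dim=1)
        return -(targets * log_p).sum(dim=1).mean()

    # Toy usage: random features, a random symmetric graph, and embeddings
    # standing in for the output of any GNN encoder.
    torch.manual_seed(0)
    n, d = 8, 16
    x = torch.randn(n, d)
    adj = (torch.rand(n, n) < 0.3).float()
    adj = ((adj + adj.T) > 0).float()             # symmetrise
    z = torch.randn(n, 32)                        # placeholder encoder output
    loss = soft_contrastive_loss(z, soft_targets(x, adj))
    print(float(loss))

Because the targets are full probability distributions over all other nodes rather than one-hot indicators, a pair that is "mostly positive" contributes proportionally, which is the sense in which soft labels retain more of the inherent pairwise information than hard sampling.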
Source journal
IEEE Transactions on Knowledge and Data Engineering (CAS category: Engineering, Electronic & Electrical)
CiteScore: 11.70
Self-citation rate: 3.40%
Annual article count: 515
Review turnaround: 6 months
Journal description: The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.