Self-supervised learning of invariant causal representation in heterogeneous information network

Impact factor: 14.7 | Zone 1, Computer Science | JCR Q1, Computer Science, Artificial Intelligence
Pei Zhang, Lihua Zhou, Yong Li, Hongmei Chen, Lizhen Wang
Information Fusion, Volume 123, Article 103246. Journal article, published 2025-05-09.
DOI: 10.1016/j.inffus.2025.103246
URL: https://www.sciencedirect.com/science/article/pii/S1566253525003197

Abstract

Invariant learning on graphs is essential for uncovering causal relationships in complex phenomena. However, most research has focused on homogeneous information networks with a single node type and a single edge type, ignoring the rich heterogeneity of real-world systems. Additionally, many invariant learning methods rely on labeled data and on the design of complex graph augmentation or contrastive sampling algorithms. These requirements demand domain-specific expertise or substantial human resources, which makes the methods difficult to apply in practice. To overcome these limitations, we propose a Generative-Contrastive Collaborative Self-Supervised Learning (GCCS) framework. This framework combines the ability of generative learning to mine supervisory signals from the data itself with the capacity of contrastive learning to learn invariant representations, enabling self-supervised learning of invariant causal representations from heterogeneous information networks (HINs). Specifically, generative self-supervised learning (SSL) constructs meta-path-aware adjacency matrices and performs a mask-reconstruct operation, while contrastive SSL refines the learned representations by enforcing similarity and consensus constraints across different views. This joint optimization captures invariant causal features, enhancing the model’s robustness. Extensive experiments on three real-world HIN datasets demonstrate that GCCS outperforms state-of-the-art baselines, particularly in noisy and complex environments, showing its robustness for self-supervised learning on heterogeneous graph structures.
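The generative step mentioned in the abstract (building a meta-path-aware adjacency matrix, then masking part of it for reconstruction) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy author-paper network, the function names `metapath_adjacency` and `mask_edges`, and the uniform random edge mask. The abstract does not specify the paper's actual construction, masking strategy, encoder, reconstruction loss, or contrastive objective, so those are left out.

```python
import random

def metapath_adjacency(incidence):
    """Binary adjacency for a symmetric two-hop meta-path, e.g. author-paper-author.

    incidence[i][k] == 1 means source node i is linked to intermediate node k.
    Two source nodes are connected if they share at least one intermediate node.
    """
    n = len(incidence)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            shared = sum(a * b for a, b in zip(incidence[i], incidence[j]))
            if shared > 0:
                adj[i][j] = adj[j][i] = 1
    return adj

def mask_edges(adj, mask_rate, rng):
    """Hide a random fraction of undirected edges (illustrative uniform mask).

    Returns the visible adjacency and the hidden edges (i, j) with i < j.
    The number of hidden edges is floor(mask_rate * number_of_edges).
    """
    n = len(adj)
    edges = [(i, j) for i in range(n) for j in range(i + 1, n) if adj[i][j]]
    n_mask = int(mask_rate * len(edges))
    hidden = rng.sample(edges, n_mask)
    visible = [row[:] for row in adj]
    for i, j in hidden:
        visible[i][j] = visible[j][i] = 0
    return visible, hidden

# Hypothetical toy data: 4 authors (rows) x 3 papers (columns).
author_paper = [
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
]
apa = metapath_adjacency(author_paper)
visible, hidden = mask_edges(apa, mask_rate=0.5, rng=random.Random(0))
```

In a mask-reconstruct setup of this kind, a model would be trained on `visible` and scored on how well it recovers the edges in `hidden`; which model and loss GCCS uses for that step is not stated in the abstract.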
Source journal: Information Fusion
Category: Engineering and Technology, Computer Science: Theory and Methods
CiteScore: 33.20
Self-citation rate: 4.30%
Articles published: 161
Review time: 7.9 months
About the journal: Information Fusion serves as a central platform for showcasing advancements in multi-sensor, multi-source, multi-process information fusion, fostering collaboration among the diverse disciplines driving its progress. It is the leading outlet for sharing research and development in this field, focusing on architectures, algorithms, and applications. Papers dealing with fundamental theoretical analyses, as well as those demonstrating their application to real-world problems, are welcome.