Summary Graph Induced Invariant Learning for Generalizable Graph Learning
Authors: Xuecheng Ning, Yujie Wang, Kui Yu, Jiali Miao, Fuyuan Cao, Jiye Liang
Journal: IEEE Transactions on Knowledge and Data Engineering, vol. 37, no. 6, pp. 3739-3752
Publication date: 2025-03-03
Publication type: Journal Article
DOI: 10.1109/TKDE.2025.3547226
URL: https://ieeexplore.ieee.org/document/10908694/
Impact factor: 8.9 (JCR Q1, Computer Science, Artificial Intelligence)
Citation count: 0
Abstract
Graph invariant learning is a promising strategy for generalizable graph learning. It identifies invariant subgraphs that give stable predictions under biased, unknown distributions by selecting important edges and nodes according to their contribution to the predictive task (i.e., subgraph predictivity). However, existing approaches that rely solely on subgraph predictivity face a challenge: the learned invariant subgraph often contains numerous spurious nodes and is poorly connected, which undermines the generalization power of Graph Neural Networks (GNNs). To tackle this issue, we propose a Summary graph-induced Invariant Learning (SIL) model that uses a summary graph to exploit both subgraph connectivity and predictivity, so that the learned invariant subgraphs are strongly connected and accurate. Specifically, SIL first learns a summary graph containing multiple strongly connected supernodes while maintaining structural consistency with the original graph. Second, the learned summary graph is disentangled into an invariant supernode and spurious counterparts to eliminate the interference of highly predictive edges and nodes. Finally, SIL identifies a potential invariant subgraph from the invariant supernode to accomplish generalization tasks. Additionally, we provide a theoretical analysis of the summary graph learning mechanism, guaranteeing that the learned summary graph is consistent with the original graph. Experimental results validate the effectiveness of the SIL model.
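The three steps in the abstract (summarize, pick the invariant supernode, extract its subgraph) can be illustrated with a toy sketch. This is not the SIL model: the abstract does not specify how supernodes or invariance are learned, so the node-to-supernode `assignment` and the per-supernode `scores` below are hand-picked, hypothetical stand-ins for what a trained model would produce, and all function names are illustrative.

```python
from collections import defaultdict


def summarize(edges, assignment):
    """Collapse a graph into a summary graph.

    `assignment` maps each node to a supernode label (assumed given here;
    a real model would learn it). Returns the members of each supernode and
    the number of original edges inside / between supernodes.
    """
    members = defaultdict(list)
    for node, supernode in assignment.items():
        members[supernode].append(node)
    weights = defaultdict(int)
    for u, v in edges:
        s, t = sorted((assignment[u], assignment[v]))
        weights[(s, t)] += 1
    return {s: sorted(ns) for s, ns in members.items()}, dict(weights)


def induced_subgraph(edges, nodes):
    """Keep only the edges whose endpoints both lie in `nodes`."""
    keep = set(nodes)
    return [(u, v) for u, v in edges if u in keep and v in keep]


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]

# Step 1 (assumed output): each triangle forms one densely connected supernode.
assignment = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
members, weights = summarize(edges, assignment)

# Step 2 (assumed output): hypothetical invariance scores per supernode.
scores = {"a": 0.9, "b": 0.2}
invariant = max(scores, key=scores.get)

# Step 3: the candidate invariant subgraph is induced by that supernode's nodes.
subgraph = induced_subgraph(edges, members[invariant])
print(weights)
print(invariant, subgraph)
```

Because the subgraph is read off a single supernode rather than assembled from individually scored edges, it inherits that supernode's internal connectivity, which is the intuition the abstract gives for combining connectivity with predictivity.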
About the journal:
The IEEE Transactions on Knowledge and Data Engineering covers knowledge and data engineering aspects of computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.