Efficient and Effective Augmentation Framework With Latent Mixup and Label-Guided Contrastive Learning for Graph Classification

IF 8.9 · Zone 2, Computer Science · Q1, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Knowledge and Data Engineering · Pub Date: 2024-09-30 · DOI: 10.1109/TKDE.2024.3471659

Aoting Zeng;Liping Wang;Wenjie Zhang;Xuemin Lin

Volume 36, Issue 12, pp. 8066-8078 · Journal Article · Full text: https://ieeexplore.ieee.org/document/10700965/

Citations: 0

Abstract

Graph Neural Networks (GNNs) with data augmentation obtain promising results among existing solutions for graph classification, and mixup-based augmentation methods have already achieved state-of-the-art performance. However, existing mixup-based augmentation methods either operate in the input space, where they face the challenge of balancing efficiency and accuracy, or conduct mixup directly in the latent space without a similarity guarantee, which leads to a lack of semantic validity and limited performance. To address these limitations, this paper proposes $\mathcal{G}$-MixCon, a novel framework leveraging the strengths of Mixup-based augmentation and supervised Contrastive learning (SCL). To the best of our knowledge, this is the first attempt to develop an SCL-based approach for learning graph representations. Specifically, two mixup-based strategies within the latent space, named $GDA_{gl}$ and $GDA_{nl}$, are proposed, which efficiently conduct linear interpolation between views at the node or graph level. Furthermore, we design a dual-objective loss function named SupMixCon that considers both the consistency among graphs and the distances between the original and augmented graphs. SupMixCon guides the training process for SCL in $\mathcal{G}$-MixCon while achieving a similarity guarantee. Comprehensive experiments are conducted on various real-world datasets; the results show that $\mathcal{G}$-MixCon demonstrably enhances performance, achieving an average accuracy increment of 6.24%, and significantly increases the robustness of GNNs against noisy labels.
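
The abstract names two building blocks, linear interpolation in the latent space and a supervised contrastive objective, without giving their formulas. The sketch below illustrates only the generic versions of both ideas: standard mixup applied to graph-level embeddings, and the standard supervised contrastive loss over labelled embeddings. It is not the paper's $GDA_{gl}$, $GDA_{nl}$, or SupMixCon. All function names, the Beta(alpha, alpha) sampling of the mixing weight, and the temperature value are assumptions made for illustration.

```python
import math
import random


def mixup(h_a, h_b, lam):
    """Linearly interpolate two equal-length vectors: lam*h_a + (1-lam)*h_b."""
    if len(h_a) != len(h_b):
        raise ValueError("embeddings must have the same dimension")
    return [lam * a + (1.0 - lam) * b for a, b in zip(h_a, h_b)]


def one_hot(label, num_classes):
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def latent_mixup(embeddings, labels, num_classes, alpha=1.0, rng=None):
    """Mix each graph embedding with a randomly chosen partner in the batch.

    Assumption: lam ~ Beta(alpha, alpha), as in standard mixup; the paper's
    own sampling scheme is not stated in the abstract.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)
    perm = list(range(len(embeddings)))
    rng.shuffle(perm)
    mixed_h = [mixup(embeddings[i], embeddings[j], lam)
               for i, j in enumerate(perm)]
    # Labels are interpolated with the same weight, giving soft targets.
    mixed_y = [mixup(one_hot(labels[i], num_classes),
                     one_hot(labels[j], num_classes), lam)
               for i, j in enumerate(perm)]
    return mixed_h, mixed_y, lam


def _normalize(v):
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def supcon_loss(embeddings, labels, temperature=0.5):
    """Standard supervised contrastive loss (generic form, not SupMixCon).

    For each anchor, same-label samples are positives; the loss is the mean
    negative log-probability of the positives under a softmax over all
    other samples. Anchors without any positive are skipped.
    """
    z = [_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        sims = {j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
                for j in range(n) if j != i}
        m = max(sims.values())  # subtract the max for numerical stability
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


if __name__ == "__main__":
    # Toy graph-level embeddings for four graphs in two classes.
    h = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
    y = [0, 0, 1, 1]
    mixed_h, mixed_y, lam = latent_mixup(h, y, num_classes=2,
                                         rng=random.Random(0))
    print("lambda:", lam)
    print("mixed embeddings:", mixed_h)
    print("soft labels:", mixed_y)
    print("contrastive loss on originals:", supcon_loss(h, y))
```

Because the interpolation acts on fixed-size embeddings rather than on graph structure, it costs one pass over the batch, which is the efficiency argument the abstract makes for latent-space mixup. The contrastive term here pulls same-label embeddings together; how the paper additionally constrains the distance between original and augmented graphs is not specified in the abstract and is left out of this sketch.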

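Source journal

IEEE Transactions on Knowledge and Data Engineering (Engineering & Technology; Engineering: Electrical & Electronic)

CiteScore: 11.70

Self-citation rate: 3.40%

Articles published: 515

Review time: 6 months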

Journal description: The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.