ConUMIP: Continuous-time dynamic graph learning via uncertainty masked mix-up on representation space

IF 7.2 1区计算机科学 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Knowledge-Based Systems Pub Date : 2024-11-15 DOI:10.1016/j.knosys.2024.112748

Haoyu Zhang, Xuchu Jiang

{"title":"ConUMIP: Continuous-time dynamic graph learning via uncertainty masked mix-up on representation space","authors":"Haoyu Zhang, Xuchu Jiang","doi":"10.1016/j.knosys.2024.112748","DOIUrl":null,"url":null,"abstract":"<div><div>Representation learning on continuous-time dynamic graphs has garnered substantial attention for its capacity to model evolving entity relationships. However, existing methods exhibit pronounced overfitting, particularly in complex and sparse data scenarios. We empirically substantiate this overfitting through multiple indicators: (1) a significant performance discrepancy between training and validation/test sets, especially for long-term interaction predictions; (2) an inverse correlation between model complexity and generalization performance; (3) a widening temporal generalization gap as the prediction horizons extend; and (4) rapid performance deterioration under data-sparse conditions. These phenomena collectively demonstrate the overfitting issue, limiting the applicability of current approaches in cold-start scenarios and dynamic environments. To address this, we propose <strong>Con</strong>tinuous-Time Dynamic Graph Learning via <strong>U</strong>ncertainty <strong>M</strong>asked M<strong>I</strong>x-U<strong>P</strong> (ConUMIP), a novel data augmentation method operating in the representation space of continuous-time dynamic graphs. Unlike conventional techniques that perturb raw graph data, ConUMIP adaptively captures temporal evolution patterns and generates diverse augmented samples. This approach effectively mitigates overfitting while enhancing long-term dependency modeling. By eschewing predefined time windows and integrating both local and global structures, ConUMIP demonstrates superior adaptation to complex dynamic evolution patterns. Comprehensive evaluations across five real-world datasets validate ConUMIP's efficacy in substantially improving both the performance and generalizability of existing continuous-time dynamic graph models, particularly in long-term predictions and data-sparse scenarios, without incurring additional computational complexity, thus offering a robust solution to the overfitting challenge in this domain.</div></div>","PeriodicalId":49939,"journal":{"name":"Knowledge-Based Systems","volume":"306 ","pages":"Article 112748"},"PeriodicalIF":7.2000,"publicationDate":"2024-11-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Knowledge-Based Systems","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0950705124013820","RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}

引用次数: 0

Abstract

Representation learning on continuous-time dynamic graphs has garnered substantial attention for its capacity to model evolving entity relationships. However, existing methods exhibit pronounced overfitting, particularly in complex and sparse data scenarios. We empirically substantiate this overfitting through multiple indicators: (1) a significant performance discrepancy between training and validation/test sets, especially for long-term interaction predictions; (2) an inverse correlation between model complexity and generalization performance; (3) a widening temporal generalization gap as the prediction horizons extend; and (4) rapid performance deterioration under data-sparse conditions. These phenomena collectively demonstrate the overfitting issue, limiting the applicability of current approaches in cold-start scenarios and dynamic environments. To address this, we propose Continuous-Time Dynamic Graph Learning via Uncertainty Masked MIx-UP (ConUMIP), a novel data augmentation method operating in the representation space of continuous-time dynamic graphs. Unlike conventional techniques that perturb raw graph data, ConUMIP adaptively captures temporal evolution patterns and generates diverse augmented samples. This approach effectively mitigates overfitting while enhancing long-term dependency modeling. By eschewing predefined time windows and integrating both local and global structures, ConUMIP demonstrates superior adaptation to complex dynamic evolution patterns. Comprehensive evaluations across five real-world datasets validate ConUMIP's efficacy in substantially improving both the performance and generalizability of existing continuous-time dynamic graph models, particularly in long-term predictions and data-sparse scenarios, without incurring additional computational complexity, thus offering a robust solution to the overfitting challenge in this domain.

查看原文本刊更多论文

ConUMIP：通过表征空间上的不确定性掩蔽混合进行连续时间动态图学习

连续时间动态图的表征学习因其能够模拟不断变化的实体关系而备受关注。然而，现有的方法表现出明显的过拟合，尤其是在复杂和稀疏的数据场景中。我们通过多个指标实证了这种过拟合现象：(1) 训练集和验证/测试集之间存在明显的性能差异，尤其是在长期交互预测方面；(2) 模型复杂性和泛化性能之间存在反相关关系；(3) 随着预测范围的扩大，泛化的时间差距也在扩大；(4) 在数据稀少的条件下，性能迅速下降。这些现象共同表明了过拟合问题，限制了当前方法在冷启动场景和动态环境中的适用性。为了解决这个问题，我们提出了通过不确定性掩蔽 MIx-UP 进行连续时间动态图学习（ConUMIP），这是一种在连续时间动态图表示空间中运行的新型数据增强方法。与扰动原始图数据的传统技术不同，ConUMIP 能够自适应地捕捉时间演化模式并生成多样化的增强样本。这种方法在增强长期依赖性建模的同时，还能有效缓解过度拟合问题。通过摒弃预定义的时间窗口并整合局部和全局结构，ConUMIP 展示了对复杂动态演化模式的卓越适应性。对五个真实数据集的全面评估验证了 ConUMIP 在大幅提高现有连续时间动态图模型的性能和普适性方面的功效，尤其是在长期预测和数据稀缺的情况下，而不会产生额外的计算复杂性，从而为该领域的过拟合挑战提供了一个稳健的解决方案。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

Knowledge-Based Systems 工程技术-计算机：人工智能

CiteScore

14.80

自引率

12.50%

发文量

1245

审稿时长

7.8 months

期刊介绍： Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on knowledge-based and other artificial intelligence techniques-based systems. The journal aims to support human prediction and decision-making through data science and computation techniques, provide a balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.