ConUMIP: Continuous-time dynamic graph learning via uncertainty masked mix-up on representation space

Impact factor 7.2 · CAS Zone 1 (Computer Science) · JCR Q1 (Computer Science, Artificial Intelligence)
Haoyu Zhang, Xuchu Jiang
Knowledge-Based Systems, vol. 306, Article 112748. Published 2024-11-15. Not open access.
DOI: 10.1016/j.knosys.2024.112748
Full text: https://www.sciencedirect.com/science/article/pii/S0950705124013820
Citations: 0

Abstract

Representation learning on continuous-time dynamic graphs has garnered substantial attention for its capacity to model evolving entity relationships. However, existing methods exhibit pronounced overfitting, particularly in complex and sparse data scenarios. We empirically substantiate this overfitting through multiple indicators: (1) a significant performance discrepancy between training and validation/test sets, especially for long-term interaction predictions; (2) an inverse correlation between model complexity and generalization performance; (3) a widening temporal generalization gap as the prediction horizons extend; and (4) rapid performance deterioration under data-sparse conditions. These phenomena collectively demonstrate the overfitting issue, limiting the applicability of current approaches in cold-start scenarios and dynamic environments. To address this, we propose Continuous-Time Dynamic Graph Learning via Uncertainty Masked MIx-UP (ConUMIP), a novel data augmentation method operating in the representation space of continuous-time dynamic graphs. Unlike conventional techniques that perturb raw graph data, ConUMIP adaptively captures temporal evolution patterns and generates diverse augmented samples. This approach effectively mitigates overfitting while enhancing long-term dependency modeling. By eschewing predefined time windows and integrating both local and global structures, ConUMIP demonstrates superior adaptation to complex dynamic evolution patterns. Comprehensive evaluations across five real-world datasets validate ConUMIP's efficacy in substantially improving both the performance and generalizability of existing continuous-time dynamic graph models, particularly in long-term predictions and data-sparse scenarios, without incurring additional computational complexity, thus offering a robust solution to the overfitting challenge in this domain.
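The abstract does not give ConUMIP's algorithm, so the following is only a minimal sketch of the general idea it names: mix-up applied to learned representations rather than raw graph data, restricted by a mask. Everything specific here is an assumption for illustration, not the authors' method: per-dimension batch variance stands in for "uncertainty", the mask keeps the most variable dimensions, and the names `uncertainty_mask`, `masked_mixup`, `alpha`, and `keep_ratio` are hypothetical.

```python
import random
from statistics import pvariance


def uncertainty_mask(batch, keep_ratio=0.5):
    """Return a 0/1 mask marking the dimensions with the highest batch variance.

    Variance is only a stand-in for "uncertainty" (an assumption); the
    abstract does not describe the paper's estimator.
    """
    dim = len(batch[0])
    variances = [pvariance([row[d] for row in batch]) for d in range(dim)]
    k = max(1, int(round(keep_ratio * dim)))
    top = set(sorted(range(dim), key=lambda d: variances[d], reverse=True)[:k])
    return [1 if d in top else 0 for d in range(dim)]


def masked_mixup(batch, alpha=0.2, keep_ratio=0.5, rng=None):
    """Mix each representation with a randomly paired one on masked dimensions.

    batch: list of equal-length lists of floats (one representation per row).
    Returns (mixed, lam, perm, mask); unmasked dimensions are left unchanged.
    """
    rng = rng or random.Random()
    lam = rng.betavariate(alpha, alpha)  # mixing coefficient in [0, 1]
    perm = list(range(len(batch)))
    rng.shuffle(perm)                    # row i is paired with row perm[i]
    mask = uncertainty_mask(batch, keep_ratio)
    mixed = []
    for i, j in enumerate(perm):
        mixed.append([
            lam * a + (1.0 - lam) * b if m else a
            for a, b, m in zip(batch[i], batch[j], mask)
        ])
    return mixed, lam, perm, mask


if __name__ == "__main__":
    reps = [[0.0, 1.0, 5.0, 2.0], [0.0, 3.0, -5.0, 2.0], [0.0, 2.0, 0.0, 2.0]]
    mixed, lam, perm, mask = masked_mixup(reps, rng=random.Random(0))
    print("mask:", mask)
    print("lambda:", lam)
    print("mixed:", mixed)
```

In standard mix-up the training targets are interpolated with the same coefficient, so a caller would typically reuse `lam` and `perm` when computing the loss on the mixed representations.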
Source journal
Knowledge-Based Systems (Engineering & Technology: Computer Science, Artificial Intelligence)
CiteScore: 14.80
Self-citation rate: 12.50%
Articles published: 1245
Review time: 7.8 months
Journal description: Knowledge-Based Systems, an international and interdisciplinary journal in artificial intelligence, publishes original, innovative, and creative research results in the field. It focuses on systems based on knowledge-based and other artificial intelligence techniques. The journal aims to support human prediction and decision-making through data science and computation techniques, provide balanced coverage of theory and practical study, and encourage the development and implementation of knowledge-based intelligence models, methods, systems, and software tools. Applications in business, government, education, engineering, and healthcare are emphasized.