Improving Sequential Recommendations via Bidirectional Temporal Data Augmentation With Pre-Training

IF 10.4 · CAS Zone 2 (Computer Science) · JCR Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

IEEE Transactions on Knowledge and Data Engineering Pub Date : 2025-02-26 DOI:10.1109/TKDE.2025.3546035

Juyong Jiang;Peiyan Zhang;Yingtao Luo;Chaozhuo Li;Jae Boum Kim;Kai Zhang;Senzhang Wang;Sunghun Kim;Philip S. Yu

Volume 37, Issue 5, Pages 2652-2664 URL: https://ieeexplore.ieee.org/document/10904280/

Citations: 0

Abstract

Sequential recommendation systems are integral to discerning temporal user preferences. Yet, the task of learning from abbreviated user interaction sequences poses a notable challenge. Data augmentation has been identified as a potent strategy to enhance the informational richness of these sequences. Traditional augmentation techniques, such as item randomization, may disrupt the inherent temporal dynamics. Although recent advancements in reverse chronological pseudo-item generation have shown promise, they can introduce temporal discrepancies when assessed in a natural chronological context. In response, we introduce a sophisticated approach, Bidirectional temporal data Augmentation with pre-training (BARec). Our approach leverages bidirectional temporal augmentation and knowledge-enhanced fine-tuning to synthesize authentic pseudo-prior items that retain user preferences and capture deeper item semantic correlations, thus boosting the model’s expressive power. Our comprehensive experimental analysis on five benchmark datasets confirms the superiority of BARec across both short and elongated sequence contexts. Moreover, theoretical examination and case study offer further insight into the model’s logical processes and interpretability.
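
To make the idea of "pseudo-prior items" concrete, the following is a minimal sketch and not the BARec implementation. It assumes a much simpler stand-in for the pre-trained model: a count table of which item most often appeared immediately before a given item. Short sequences are then extended backwards in time by prepending the most likely earlier item. The function names (`fit_reverse_bigrams`, `prepend_pseudo_prior_items`), the `min_len` parameter, and the toy item ids are all hypothetical; the bidirectional training objective and knowledge-enhanced fine-tuning described in the abstract are not modeled here.

```python
from collections import Counter, defaultdict


def fit_reverse_bigrams(sequences):
    """Count, for each item, which items appeared immediately before it."""
    prev_counts = defaultdict(Counter)
    for seq in sequences:
        for prev_item, item in zip(seq, seq[1:]):
            prev_counts[item][prev_item] += 1
    return prev_counts


def prepend_pseudo_prior_items(seq, prev_counts, min_len):
    """Prepend predicted earlier items until seq reaches min_len.

    Stops early when there is no evidence about what preceded the first item.
    The input sequence is not modified.
    """
    augmented = list(seq)
    while augmented and len(augmented) < min_len:
        candidates = prev_counts.get(augmented[0])
        if not candidates:
            break
        # Deterministic tie-break: highest count first, then smallest item id.
        best = min(candidates, key=lambda it: (-candidates[it], it))
        augmented.insert(0, best)
    return augmented


# Toy interaction logs (hypothetical item ids), oldest interaction first.
logs = [
    ["a", "b", "c", "d"],
    ["a", "b", "c", "e"],
    ["x", "b", "c", "d"],
]
model = fit_reverse_bigrams(logs)
short_seq = ["c", "d"]
augmented_seq = prepend_pseudo_prior_items(short_seq, model, min_len=4)
print(augmented_seq)
```

The chronological order of the original items is preserved, which is the property the abstract contrasts with item randomization. A count table like this cannot capture the deeper item semantic correlations the paper targets; it only illustrates where the generated items are placed.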

Source journal

IEEE Transactions on Knowledge and Data Engineering (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore

11.70

Self-citation rate

3.40%

Articles published

515

Review time

6 months

About the journal: The IEEE Transactions on Knowledge and Data Engineering encompasses knowledge and data engineering aspects within computer science, artificial intelligence, electrical engineering, computer engineering, and related fields. It provides an interdisciplinary platform for disseminating new developments in knowledge and data engineering and explores the practicality of these concepts in both hardware and software. Specific areas covered include knowledge-based and expert systems, AI techniques for knowledge and data management, tools, and methodologies, distributed processing, real-time systems, architectures, data management practices, database design, query languages, security, fault tolerance, statistical databases, algorithms, performance evaluation, and applications.