Multi-Scale Transformers with dual attention and adaptive masking for sequential recommendation

Haiqin Li, Yuhan Yang, Jun Zeng, Min Gao, Junhao Wen

Information Processing & Management, Volume 63, Issue 1, Article 104318
DOI: 10.1016/j.ipm.2025.104318
Published: 2025-07-31
Citations: 0
Abstract
Sequential recommendation focuses on modeling and predicting a user's next actions from their sequential behavior patterns, using the temporal order and dynamics of user actions to provide more personalized and contextual suggestions. Existing sequential recommendation models rely on a limited range of temporal scales, making it challenging to explicitly capture diverse user behaviors that span multiple scales. Motivated by this challenge, this paper introduces ScaleRec, a Multi-Scale Transformer architecture augmented with dual attention mechanisms and adaptive masking for sequential recommendation. ScaleRec integrates interaction granularity and context through multi-scale division, segmenting user behavior sequences into patches of varying lengths. Dual attention, comprising intra-patch cross-attention and inter-patch self-attention, explicitly models fine-grained interests and coarse-grained preferences. Specifically, intra-patch cross-attention employs a learnable Gaussian kernel to introduce locality-based inductive biases, capturing fine-grained behavioral dynamics. Inter-patch self-attention is further enhanced by a Context-adaptive Preferences Aggregator, which dynamically selects and integrates relevant long-term user preferences. Additionally, we introduce an adaptive masking fusion strategy to dynamically filter redundant information. Extensive experiments on six benchmark datasets show that ScaleRec achieves state-of-the-art performance, improving recommendation performance by up to 24.95% in terms of HR@5. The code of the proposed model is available at: https://github.com/gangtann/ScaleRec.
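The abstract describes two mechanisms concretely enough to illustrate: segmenting a behavior sequence into fixed-length patches (multi-scale division uses several patch lengths), and adding a Gaussian locality bias to attention logits so nearby interactions attend to each other more strongly. The following is a minimal NumPy sketch, not the authors' implementation: the function names, the additive log-domain form of the Gaussian bias, the fixed (non-learnable) sigma, and the omission of learned query/key/value projections are all illustrative assumptions.

```python
import numpy as np

def split_into_patches(seq, patch_len):
    """Segment a behavior sequence of shape (n_items, d) into
    non-overlapping patches of length patch_len, zero-padding the tail."""
    n, d = seq.shape
    pad = (-n) % patch_len
    if pad:
        seq = np.vstack([seq, np.zeros((pad, d))])
    return seq.reshape(-1, patch_len, d)

def gaussian_locality_bias(length, sigma):
    """Additive bias on attention logits: positions close to each other
    receive a larger (less negative) bias, injecting a locality prior."""
    pos = np.arange(length)
    dist2 = (pos[:, None] - pos[None, :]) ** 2
    return -dist2 / (2.0 * sigma ** 2)

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def biased_self_attention(x, sigma=2.0):
    """Scaled dot-product self-attention over x (length, d) with the
    Gaussian locality bias added to the logits before softmax."""
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d) + gaussian_locality_bias(len(x), sigma)
    return softmax(scores) @ x

# A short sequence of 10 interaction embeddings, patch length 4:
seq = np.arange(40, dtype=float).reshape(10, 4)
patches = split_into_patches(seq, 4)   # shape (3, 4, 4), tail zero-padded
out = biased_self_attention(seq)       # shape (10, 4)
```

In the paper's full model the kernel width is learnable and the bias is applied inside intra-patch cross-attention; running several patch lengths in parallel and fusing their outputs is what makes the division multi-scale.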
About the journal:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology, marketing, and social computing.
We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.