Min Tang, Shujie Cui, Zhe Jin, Shiuan-ni Liang, Chenliang Li, Lixin Zou
{"title":"通过重新编程预训练变换器进行顺序推荐","authors":"Min Tang , Shujie Cui , Zhe Jin , Shiuan-ni Liang , Chenliang Li , Lixin Zou","doi":"10.1016/j.ipm.2024.103938","DOIUrl":null,"url":null,"abstract":"<div><div>Inspired by the success of Pre-trained language models (PLMs), numerous sequential recommenders attempted to replicate its achievements by employing PLMs’ efficient architectures for building large models and using self-supervised learning for broadening training data. Despite their success, there is curiosity about developing a large-scale sequential recommender system since existing methods either build models within a single dataset or utilize text as an intermediary for alignment across different datasets. However, due to the sparsity of user–item interactions, unalignment between different datasets, and lack of global information in the sequential recommendation, directly pre-training a large foundation model may not be feasible.</div><div>Towards this end, we propose the <span>RecPPT</span> that firstly employs the GPT-2 to model historical sequence by training the input item embedding and the output layer from scratch, which avoids training a large model on the sparse user–item interactions. Additionally, to alleviate the burden of unalignment, the <span>RecPPT</span> is equipped with a reprogramming module to reprogram the target embedding to existing well-trained proto-embeddings. Furthermore, <span>RecPPT</span> integrates global information into sequences by initializing the item embedding using an SVD-based initializer. Extensive experiments over four datasets demonstrated the <span>RecPPT</span> achieved an average improvement of 6.5% on NDCG@5, 6.2% on NDCG@10, 6.1% on Recall@5, and 5.4% on Recall@10 compared to the baselines. Particularly in few-shot scenarios, the significant improvements in NDCG@10 confirm the superiority of the proposed method.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":null,"pages":null},"PeriodicalIF":7.4000,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Sequential recommendation by reprogramming pretrained transformer\",\"authors\":\"Min Tang , Shujie Cui , Zhe Jin , Shiuan-ni Liang , Chenliang Li , Lixin Zou\",\"doi\":\"10.1016/j.ipm.2024.103938\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>Inspired by the success of Pre-trained language models (PLMs), numerous sequential recommenders attempted to replicate its achievements by employing PLMs’ efficient architectures for building large models and using self-supervised learning for broadening training data. Despite their success, there is curiosity about developing a large-scale sequential recommender system since existing methods either build models within a single dataset or utilize text as an intermediary for alignment across different datasets. However, due to the sparsity of user–item interactions, unalignment between different datasets, and lack of global information in the sequential recommendation, directly pre-training a large foundation model may not be feasible.</div><div>Towards this end, we propose the <span>RecPPT</span> that firstly employs the GPT-2 to model historical sequence by training the input item embedding and the output layer from scratch, which avoids training a large model on the sparse user–item interactions. 
Additionally, to alleviate the burden of unalignment, the <span>RecPPT</span> is equipped with a reprogramming module to reprogram the target embedding to existing well-trained proto-embeddings. Furthermore, <span>RecPPT</span> integrates global information into sequences by initializing the item embedding using an SVD-based initializer. Extensive experiments over four datasets demonstrated the <span>RecPPT</span> achieved an average improvement of 6.5% on NDCG@5, 6.2% on NDCG@10, 6.1% on Recall@5, and 5.4% on Recall@10 compared to the baselines. Particularly in few-shot scenarios, the significant improvements in NDCG@10 confirm the superiority of the proposed method.</div></div>\",\"PeriodicalId\":50365,\"journal\":{\"name\":\"Information Processing & Management\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":7.4000,\"publicationDate\":\"2024-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Information Processing & Management\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0306457324002978\",\"RegionNum\":1,\"RegionCategory\":\"管理学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Information Processing & Management","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0306457324002978","RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Sequential recommendation by reprogramming pretrained transformer
Inspired by the success of pre-trained language models (PLMs), numerous sequential recommenders have attempted to replicate their achievements by adopting PLMs' efficient architectures to build large models and using self-supervised learning to broaden the training data. Despite these successes, whether a large-scale sequential recommender system can be developed remains an open question, since existing methods either build models within a single dataset or rely on text as an intermediary for alignment across different datasets. However, due to the sparsity of user–item interactions, the misalignment between different datasets, and the lack of global information in sequential recommendation, directly pre-training a large foundation model may not be feasible.
To this end, we propose RecPPT, which first employs GPT-2 to model historical sequences by training the input item embedding and the output layer from scratch, avoiding training a large model on sparse user–item interactions. Additionally, to alleviate the burden of misalignment, RecPPT is equipped with a reprogramming module that reprograms the target embeddings onto existing well-trained proto-embeddings. Furthermore, RecPPT integrates global information into the sequences by initializing the item embeddings with an SVD-based initializer. Extensive experiments on four datasets demonstrate that RecPPT achieves average improvements of 6.5% on NDCG@5, 6.2% on NDCG@10, 6.1% on Recall@5, and 5.4% on Recall@10 over the baselines. The significant improvements in NDCG@10, particularly in few-shot scenarios, confirm the superiority of the proposed method.
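The abstract names three components: an SVD-based initializer for the item embeddings, a reprogramming module that maps target embeddings onto well-trained proto-embeddings, and a pretrained GPT-2 backbone whose input embedding and output layer are trained from scratch. The sketch below shows one way these pieces could fit together in PyTorch; the cross-attention form of the reprogramming step, the frozen backbone, and all class and parameter names are illustrative assumptions, not the authors' exact implementation.

```python
# A minimal sketch of the pipeline outlined in the abstract, assuming a PyTorch /
# Hugging Face setup. Class names, dimensions, the cross-attention reprogramming,
# and the frozen GPT-2 backbone are assumptions for illustration only.
import torch
import torch.nn as nn
from transformers import GPT2Model


class SVDItemEmbedding(nn.Module):
    """Item embeddings initialized from a truncated SVD of the user-item
    interaction matrix, injecting global co-occurrence information."""

    def __init__(self, interactions: torch.Tensor, dim: int):
        super().__init__()
        # interactions: (num_users, num_items) implicit-feedback matrix.
        _, _, v = torch.svd_lowrank(interactions, q=dim)   # v: (num_items, dim)
        self.embedding = nn.Embedding.from_pretrained(v, freeze=False)

    def forward(self, item_ids: torch.Tensor) -> torch.Tensor:
        return self.embedding(item_ids)


class ReprogrammingModule(nn.Module):
    """Reprograms target item embeddings onto a fixed bank of well-trained
    proto-embeddings; cross-attention is one plausible realization."""

    def __init__(self, dim: int, proto_embeddings: torch.Tensor):
        super().__init__()
        self.register_buffer("protos", proto_embeddings)    # (num_protos, dim), frozen
        self.attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)

    def forward(self, item_emb: torch.Tensor) -> torch.Tensor:
        # item_emb: (batch, seq_len, dim); attend from the items to the proto bank.
        protos = self.protos.unsqueeze(0).expand(item_emb.size(0), -1, -1)
        out, _ = self.attn(query=item_emb, key=protos, value=protos)
        return out


class RecPPTSketch(nn.Module):
    def __init__(self, interactions, num_items, proto_embeddings, dim=768):
        super().__init__()
        self.item_emb = SVDItemEmbedding(interactions, dim)   # trained from scratch
        self.reprogram = ReprogrammingModule(dim, proto_embeddings)
        self.backbone = GPT2Model.from_pretrained("gpt2")     # dim must match GPT-2's 768
        for p in self.backbone.parameters():                  # keep the pretrained
            p.requires_grad = False                           # transformer frozen
        self.output_layer = nn.Linear(dim, num_items)         # trained from scratch

    def forward(self, item_ids: torch.Tensor) -> torch.Tensor:
        h = self.reprogram(self.item_emb(item_ids))
        h = self.backbone(inputs_embeds=h).last_hidden_state
        return self.output_layer(h[:, -1])                    # next-item scores
```

In this reading, only the item embedding, the reprogramming module, and the output layer are trainable, which matches the abstract's goal of avoiding training a large model on sparse user–item interactions.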
Journal introduction:
Information Processing and Management is dedicated to publishing cutting-edge original research at the convergence of computing and information science. Our scope encompasses theory, methods, and applications across various domains, including advertising, business, health, information science, information technology, marketing, and social computing.
We aim to cater to the interests of both primary researchers and practitioners by offering an effective platform for the timely dissemination of advanced and topical issues in this interdisciplinary field. The journal places particular emphasis on original research articles, research survey articles, research method articles, and articles addressing critical applications of research. Join us in advancing knowledge and innovation at the intersection of computing and information science.