Enhancing Sequential Recommendations through Multi-Perspective Reflections and Iteration
Weicong Qin, Yi Xu, Weijie Yu, Chenglei Shen, Xiao Zhang, Ming He, Jianping Fan, Jun Xu
arXiv:2409.06377 (arXiv - CS - Information Retrieval, 2024-09-10)
Abstract
Sequential recommendation (SeqRec) aims to predict the next item a user will interact with by understanding user intentions and leveraging collaborative filtering information. Large language models (LLMs) have shown great promise in recommendation tasks through prompting, fixed reflection libraries, and fine-tuning, but these approaches face challenges including a lack of supervision, an inability to optimize reflection sources, inflexibility toward diverse user needs, and high computational costs. Moreover, despite promising results, current studies focus mainly on reflections over users' explicit preferences (e.g., item titles) while neglecting implicit preferences (e.g., brands) and collaborative filtering information, which hinders the capture of preference shifts and dynamic user behaviors. Existing approaches also lack mechanisms for evaluating and iterating on reflections, often leading to suboptimal recommendations. To address these issues, we propose the Mixture of REflectors (MoRE) framework, designed to model and learn dynamic user preferences in SeqRec. Specifically, MoRE introduces three reflectors that generate LLM-based reflections on explicit preferences, implicit preferences, and collaborative signals, respectively. Each reflector incorporates a self-improving strategy, termed refining-and-iteration, to evaluate and iteratively update its reflections. Furthermore, a meta-reflector employs a contextual bandit algorithm to select the most suitable expert and its corresponding reflections for each user's recommendation, effectively capturing dynamic preferences. Extensive experiments on three real-world datasets demonstrate that MoRE consistently outperforms state-of-the-art methods while requiring less training time and GPU memory than other LLM-based SeqRec approaches.
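The abstract does not specify which contextual bandit algorithm the meta-reflector uses, so the sketch below only illustrates the general idea with a standard LinUCB-style policy: each arm corresponds to one reflector (explicit preferences, implicit preferences, collaborative signals), the context is a user feature vector, and the reward is the recommendation improvement attributed to the selected reflection. All class and variable names, the feature dimension, and the reward definition are illustrative assumptions, not details from the paper.

```python
# Illustrative sketch only: a LinUCB-style contextual bandit choosing among
# three reflectors per user. Arm names, context features, and the reward
# signal are assumptions for demonstration, not the paper's implementation.
import numpy as np

ARMS = ["explicit_preference", "implicit_preference", "collaborative_signal"]


class LinUCBMetaReflector:
    def __init__(self, n_arms: int, dim: int, alpha: float = 1.0):
        self.alpha = alpha                               # exploration strength
        self.A = [np.eye(dim) for _ in range(n_arms)]    # per-arm design matrices
        self.b = [np.zeros(dim) for _ in range(n_arms)]  # per-arm reward vectors

    def select(self, x: np.ndarray) -> int:
        """Return the reflector with the highest upper confidence bound
        for the user context vector x."""
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b                            # ridge-regression estimate
            scores.append(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))
        return int(np.argmax(scores))

    def update(self, arm: int, x: np.ndarray, reward: float) -> None:
        """Fold the observed reward (e.g., a ranking gain measured after using
        the chosen reflection) back into that arm's statistics."""
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x


# Usage: pick a reflector for one user, observe a (placeholder) reward, update.
bandit = LinUCBMetaReflector(n_arms=len(ARMS), dim=8)
user_context = np.random.rand(8)    # stand-in for user/session features
arm = bandit.select(user_context)
bandit.update(arm, user_context, reward=0.02)
print("selected reflector:", ARMS[arm])
```

In this reading, the bandit's reward feedback is what lets the meta-reflector adapt to preference shifts over time; how MoRE actually defines the context and reward is described in the paper itself.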