RLPer: A Reinforcement Learning Model for Personalized Search

Jing Yao, Zhicheng Dou, Jun Xu, Ji-rong Wen
{"title":"RLPer: A Reinforcement Learning Model for Personalized Search","authors":"Jing Yao, Zhicheng Dou, Jun Xu, Ji-rong Wen","doi":"10.1145/3366423.3380294","DOIUrl":null,"url":null,"abstract":"Personalized search improves generic ranking models by taking user interests into consideration and returning more accurate search results to individual users. In recent years, machine learning and deep learning techniques have been successfully applied in personalized search. Most existing personalization models simply regard the search history as a static set of user behaviours and learn fixed ranking strategies based on the recorded data. Though improvements have been observed, it is obvious that these methods ignore the dynamic nature of the search process: search is a sequence of interactions between the search engine and the user. During the search process, the user interests may dynamically change. It would be more helpful if a personalized search model could track the whole interaction process and update its ranking strategy continuously. In this paper, we propose a reinforcement learning based personalization model, referred to as RLPer, to track the sequential interactions between the users and search engine with a hierarchical Markov Decision Process (MDP). In RLPer, the search engine interacts with the user to update the underlying ranking model continuously with real-time feedback. And we design a feedback-aware personalized ranking component to catch the user’s feedback which has impacts on the user interest profile for the next query. Experimental results on the publicly available AOL search log verify that our proposed model can significantly outperform state-of-the-art personalized search models.","PeriodicalId":20754,"journal":{"name":"Proceedings of The Web Conference 2020","volume":"87 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2020-04-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"20","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of The Web Conference 2020","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3366423.3380294","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 20

Abstract

Personalized search improves generic ranking models by taking user interests into consideration and returning more accurate search results to individual users. In recent years, machine learning and deep learning techniques have been successfully applied in personalized search. Most existing personalization models simply regard the search history as a static set of user behaviours and learn fixed ranking strategies based on the recorded data. Though improvements have been observed, these methods ignore the dynamic nature of the search process: search is a sequence of interactions between the search engine and the user, and the user's interests may change over the course of these interactions. It would be more helpful if a personalized search model could track the whole interaction process and update its ranking strategy continuously. In this paper, we propose a reinforcement learning based personalization model, referred to as RLPer, to track the sequential interactions between the user and the search engine with a hierarchical Markov Decision Process (MDP). In RLPer, the search engine interacts with the user and updates the underlying ranking model continuously with real-time feedback. We also design a feedback-aware personalized ranking component to capture the user's feedback, which in turn shapes the user interest profile for the next query. Experimental results on the publicly available AOL search log verify that our proposed model can significantly outperform state-of-the-art personalized search models.
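
The abstract describes the interaction loop only at a high level: a hierarchical MDP in which the search engine ranks results for each query, observes the user's click feedback, and folds that feedback into the interest profile used for the next query. The Python sketch below illustrates such a session-level / query-level loop under stated assumptions; the class and function names, the reciprocal-rank reward, and the single learned personalization weight are illustrative choices, not the model from the paper.

```python
# Illustrative sketch only: a toy session-level / query-level interaction loop
# in the spirit of the hierarchical MDP the abstract describes. All names,
# the reciprocal-rank reward, and the weight-update rule are assumptions made
# for illustration; they are not taken from the RLPer paper.

from dataclasses import dataclass, field


@dataclass
class SessionState:
    """Session-level state: the interest profile carried across queries."""
    profile: dict = field(default_factory=dict)  # term -> weight

    def update(self, clicked_terms):
        # Fold click feedback from the current query into the profile so that
        # it influences ranking for the next query in the session.
        for term in clicked_terms:
            self.profile[term] = self.profile.get(term, 0.0) + 1.0


def score(doc_terms, query_terms, profile, personalization_weight):
    """Query-level action: score one candidate by query match plus profile match."""
    match = len(set(doc_terms) & set(query_terms))
    personal = sum(profile.get(t, 0.0) for t in doc_terms)
    return match + personalization_weight * personal


def interaction_loop(queries, candidates, user_clicks, episodes=50, lr=0.05):
    """Learn a single personalization weight from simulated click feedback."""
    weight = 0.5
    for _ in range(episodes):
        state = SessionState()  # new session per episode
        for query in queries:
            docs = candidates[query]
            ranked = sorted(
                docs,
                key=lambda d: score(d["terms"], query.split(), state.profile, weight),
                reverse=True,
            )
            clicked = user_clicks(query, ranked)  # user feedback (simulated)
            if clicked is not None:
                # Reward: reciprocal rank of the clicked document.
                reward = 1.0 / (ranked.index(clicked) + 1)
                # Crude improvement step: nudge the weight toward settings
                # that place clicked documents higher in the ranking.
                weight += lr * (reward - 0.5)
                state.update(clicked["terms"])
    return weight


if __name__ == "__main__":
    candidates = {
        "python tutorial": [
            {"id": 1, "terms": ["python", "tutorial"]},
            {"id": 2, "terms": ["python", "snake"]},
        ]
    }
    # Simulated user who always clicks the programming-related result.
    prefers = lambda q, ranked: next(d for d in ranked if "tutorial" in d["terms"])
    print(interaction_loop(["python tutorial"], candidates, prefers))
```

In the paper itself, the underlying ranking model updated with real-time feedback is far richer than a single scalar weight; the sketch only mirrors the overall structure of the interaction loop, not the feedback-aware ranking component.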