{"title":"RLPS: A Reinforcement Learning–Based Framework for Personalized Search","authors":"Jing Yao, Zhicheng Dou, Jun Xu, Jirong Wen","doi":"10.1145/3446617","DOIUrl":null,"url":null,"abstract":"Personalized search is a promising way to improve search qualities by taking user interests into consideration. Recently, machine learning and deep learning techniques have been successfully applied to search result personalization. Most existing models simply regard the personal search history as a static set of user behaviors and learn fixed ranking strategies based on all the recorded data. Though improvements have been achieved, the essence that the search process is a sequence of interactions between the search engine and user is ignored. The user’s interests may dynamically change during the search process, therefore, it would be more helpful if a personalized search model could track the whole interaction process and adjust its ranking strategy continuously. In this article, we adapt reinforcement learning to personalized search and propose a framework, referred to as RLPS. It utilizes a Markov Decision Process (MDP) to track sequential interactions between the user and search engine, and continuously update the underlying personalized ranking model with the user’s real-time feedback to learn the user’s dynamic interests. Within this framework, we implement two models: the listwise RLPS-L and the hierarchical RLPS-H. RLPS-L interacts with users and trains the ranking model with document lists, while RLPS-H improves model training by designing a layered structure and introducing document pairs. In addition, we also design a feedback-aware personalized ranking component to capture the user’s feedback, which impacts the user interest profile for the next query. Significant improvements over existing personalized search models are observed in the experiments on the public AOL search log and a commercial log.","PeriodicalId":6934,"journal":{"name":"ACM Transactions on Information Systems (TOIS)","volume":"24 1","pages":"1 - 29"},"PeriodicalIF":0.0000,"publicationDate":"2021-05-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Information Systems (TOIS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3446617","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3
Abstract
Personalized search is a promising way to improve search qualities by taking user interests into consideration. Recently, machine learning and deep learning techniques have been successfully applied to search result personalization. Most existing models simply regard the personal search history as a static set of user behaviors and learn fixed ranking strategies based on all the recorded data. Though improvements have been achieved, the essence that the search process is a sequence of interactions between the search engine and user is ignored. The user’s interests may dynamically change during the search process, therefore, it would be more helpful if a personalized search model could track the whole interaction process and adjust its ranking strategy continuously. In this article, we adapt reinforcement learning to personalized search and propose a framework, referred to as RLPS. It utilizes a Markov Decision Process (MDP) to track sequential interactions between the user and search engine, and continuously update the underlying personalized ranking model with the user’s real-time feedback to learn the user’s dynamic interests. Within this framework, we implement two models: the listwise RLPS-L and the hierarchical RLPS-H. RLPS-L interacts with users and trains the ranking model with document lists, while RLPS-H improves model training by designing a layered structure and introducing document pairs. In addition, we also design a feedback-aware personalized ranking component to capture the user’s feedback, which impacts the user interest profile for the next query. Significant improvements over existing personalized search models are observed in the experiments on the public AOL search log and a commercial log.