强化学习排序

Proceedings of the Ninth ACM International Conference on Web Search and Data Mining Pub Date : 2019-01-30 DOI:10.1145/3289600.3291605

M. de Rijke

{"title":"强化学习排序","authors":"M. de Rijke","doi":"10.1145/3289600.3291605","DOIUrl":null,"url":null,"abstract":"Interactive systems such as search engines or recommender systems are increasingly moving away from single-turn exchanges with users. Instead, series of exchanges between the user and the system are becoming mainstream, especially when users have complex needs or when the system struggles to understand the user's intent. Standard machine learning has helped us a lot in the single-turn paradigm, where we use it to predict: intent, relevance, user satisfaction, etc. When we think of search or recommendation as a series of exchanges, we need to turn to bandit algorithms to determine which action the system should take next, or to reinforcement learning to determine not just the next action but also to plan future actions and estimate their potential pay-off. The use of reinforcement learning for search and recommendations comes with a number of challenges, because of the very large action spaces, the large number of potential contexts, and noisy feedback signals characteristic for this domain. This presentation will survey some recent success stories of reinforcement learning for search, recommendation, and conversations; and will identify promising future research directions for reinforcement learning for search and recommendation.","PeriodicalId":20567,"journal":{"name":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","volume":"12 1","pages":"5"},"PeriodicalIF":0.0000,"publicationDate":"2019-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":"{\"title\":\"Reinforcement Learning to Rank\",\"authors\":\"M. de Rijke\",\"doi\":\"10.1145/3289600.3291605\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Interactive systems such as search engines or recommender systems are increasingly moving away from single-turn exchanges with users. Instead, series of exchanges between the user and the system are becoming mainstream, especially when users have complex needs or when the system struggles to understand the user's intent. Standard machine learning has helped us a lot in the single-turn paradigm, where we use it to predict: intent, relevance, user satisfaction, etc. When we think of search or recommendation as a series of exchanges, we need to turn to bandit algorithms to determine which action the system should take next, or to reinforcement learning to determine not just the next action but also to plan future actions and estimate their potential pay-off. The use of reinforcement learning for search and recommendations comes with a number of challenges, because of the very large action spaces, the large number of potential contexts, and noisy feedback signals characteristic for this domain. This presentation will survey some recent success stories of reinforcement learning for search, recommendation, and conversations; and will identify promising future research directions for reinforcement learning for search and recommendation.\",\"PeriodicalId\":20567,\"journal\":{\"name\":\"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining\",\"volume\":\"12 1\",\"pages\":\"5\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-01-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"8\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3289600.3291605\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Ninth ACM International Conference on Web Search and Data Mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3289600.3291605","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 8

摘要

诸如搜索引擎或推荐系统之类的交互式系统正逐渐远离与用户的单轮交换。相反，用户和系统之间的一系列交流正在成为主流，特别是当用户有复杂的需求或系统难以理解用户的意图时。标准机器学习在单回合模式中帮助了我们很多，我们用它来预测:意图、相关性、用户满意度等。当我们认为搜索或推荐是一系列的交流时，我们需要求助于强盗算法来确定系统下一步应该采取什么行动，或者求助于强化学习，不仅要确定下一步行动，还要计划未来的行动，并估计它们的潜在回报。在搜索和推荐中使用强化学习带来了许多挑战，因为这个领域有非常大的动作空间、大量的潜在上下文和噪声反馈信号特征。本演讲将调查一些最近在搜索、推荐和对话方面的强化学习的成功案例;并将为搜索和推荐的强化学习确定有希望的未来研究方向。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Reinforcement Learning to Rank

Interactive systems such as search engines or recommender systems are increasingly moving away from single-turn exchanges with users. Instead, series of exchanges between the user and the system are becoming mainstream, especially when users have complex needs or when the system struggles to understand the user's intent. Standard machine learning has helped us a lot in the single-turn paradigm, where we use it to predict: intent, relevance, user satisfaction, etc. When we think of search or recommendation as a series of exchanges, we need to turn to bandit algorithms to determine which action the system should take next, or to reinforcement learning to determine not just the next action but also to plan future actions and estimate their potential pay-off. The use of reinforcement learning for search and recommendations comes with a number of challenges, because of the very large action spaces, the large number of potential contexts, and noisy feedback signals characteristic for this domain. This presentation will survey some recent success stories of reinforcement learning for search, recommendation, and conversations; and will identify promising future research directions for reinforcement learning for search and recommendation.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the Ninth ACM International Conference on Web Search and Data Mining

自引率

0.00%

发文量