Federated Online Learning to Rank with Evolution Strategies

E. Kharitonov
{"title":"联合在线学习与进化策略排序","authors":"E. Kharitonov","doi":"10.1145/3289600.3290968","DOIUrl":null,"url":null,"abstract":"Online Learning to Rank is a powerful paradigm that allows to train ranking models using only online feedback from its users.In this work, we consider Federated Online Learning to Rank setup (FOLtR) where on-mobile ranking models are trained in a way that respects the users' privacy. We require that the user data, such as queries, results, and their feature representations are never communicated for the purpose of the ranker's training. We believe this setup is interesting, as it combines unique requirements for the learning algorithm: (a) preserving the user privacy, (b) low communication and computation costs, (c) learning from noisy bandit feedback, and (d) learning with non-continuous ranking quality measures. We propose a learning algorithm FOLtR-ES that satisfies these requirements. A part of FOLtR-ES is a privatization procedure that allows it to provide ε-local differential privacy guarantees, i.e. protecting the clients from an adversary who has access to the communicated messages. This procedure can be applied to any absolute online metric that takes finitely many values or can be discretized to a finite domain. Our experimental study is based on a widely used click simulation approach and publicly available learning to rank datasets MQ2007 and MQ2008. We evaluate FOLtR-ES against offline baselines that are trained using relevance labels, linear regression model and RankingSVM. From our experiments, we observe that FOLtR-ES can optimize a ranking model to perform similarly to the baselines in terms of the optimized online metric, Max Reciprocal Rank.","PeriodicalId":143253,"journal":{"name":"Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining","volume":"16 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"24","resultStr":"{\"title\":\"Federated Online Learning to Rank with Evolution Strategies\",\"authors\":\"E. Kharitonov\",\"doi\":\"10.1145/3289600.3290968\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Online Learning to Rank is a powerful paradigm that allows to train ranking models using only online feedback from its users.In this work, we consider Federated Online Learning to Rank setup (FOLtR) where on-mobile ranking models are trained in a way that respects the users' privacy. We require that the user data, such as queries, results, and their feature representations are never communicated for the purpose of the ranker's training. We believe this setup is interesting, as it combines unique requirements for the learning algorithm: (a) preserving the user privacy, (b) low communication and computation costs, (c) learning from noisy bandit feedback, and (d) learning with non-continuous ranking quality measures. We propose a learning algorithm FOLtR-ES that satisfies these requirements. A part of FOLtR-ES is a privatization procedure that allows it to provide ε-local differential privacy guarantees, i.e. protecting the clients from an adversary who has access to the communicated messages. This procedure can be applied to any absolute online metric that takes finitely many values or can be discretized to a finite domain. Our experimental study is based on a widely used click simulation approach and publicly available learning to rank datasets MQ2007 and MQ2008. 
We evaluate FOLtR-ES against offline baselines that are trained using relevance labels, linear regression model and RankingSVM. From our experiments, we observe that FOLtR-ES can optimize a ranking model to perform similarly to the baselines in terms of the optimized online metric, Max Reciprocal Rank.\",\"PeriodicalId\":143253,\"journal\":{\"name\":\"Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining\",\"volume\":\"16 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-01-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"24\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3289600.3290968\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Twelfth ACM International Conference on Web Search and Data Mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3289600.3290968","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 24

Abstract

Online Learning to Rank is a powerful paradigm that makes it possible to train ranking models using only online feedback from users. In this work, we consider the Federated Online Learning to Rank setup (FOLtR), in which on-mobile ranking models are trained in a way that respects the users' privacy. We require that user data, such as queries, results, and their feature representations, are never communicated for the purpose of training the ranker. We believe this setup is interesting, as it combines unique requirements for the learning algorithm: (a) preserving user privacy, (b) low communication and computation costs, (c) learning from noisy bandit feedback, and (d) learning with non-continuous ranking quality measures. We propose a learning algorithm, FOLtR-ES, that satisfies these requirements. A part of FOLtR-ES is a privatization procedure that allows it to provide ε-local differential privacy guarantees, i.e. protecting the clients from an adversary who has access to the communicated messages. This procedure can be applied to any absolute online metric that takes finitely many values or can be discretized to a finite domain. Our experimental study is based on a widely used click simulation approach and the publicly available learning-to-rank datasets MQ2007 and MQ2008. We evaluate FOLtR-ES against offline baselines that are trained using relevance labels: a linear regression model and RankingSVM. From our experiments, we observe that FOLtR-ES can optimize a ranking model to perform similarly to the baselines in terms of the optimized online metric, Max Reciprocal Rank.
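To make the setup concrete, the following is a minimal, hypothetical Python/NumPy sketch of the two ingredients the abstract names: a randomized-response privatization step that yields ε-local differential privacy over a finite set of metric values, and a server-side evolution strategies update computed from the privatized client reports. The function names (privatize, client_round, server_update), the linear-model assumption, and all parameters are illustrative and are not taken from the paper's implementation.

# Hypothetical sketch of the ideas described in the abstract; not the paper's code.
import numpy as np

def privatize(value, candidates, epsilon, rng):
    """Randomized response over finitely many metric values: report the true
    value with probability e^eps / (e^eps + k - 1), otherwise report a
    uniformly chosen different value. This is one standard way to obtain
    eps-local differential privacy for a finitely-valued report."""
    k = len(candidates)
    p_true = np.exp(epsilon) / (np.exp(epsilon) + k - 1)
    if rng.random() < p_true:
        return value
    others = [c for c in candidates if c != value]
    return float(rng.choice(others))

def client_round(theta, sigma, epsilon, candidates, evaluate_metric, rng):
    """On-device step: evaluate the online metric (e.g. the reciprocal rank of
    the clicked result) under a randomly perturbed model, then send back only
    the perturbation seed and the privatized metric value -- never queries,
    results, or feature vectors."""
    seed = int(rng.integers(0, 2**31 - 1))
    noise = np.random.default_rng(seed).standard_normal(theta.shape)
    metric = evaluate_metric(theta + sigma * noise)
    return seed, privatize(metric, candidates, epsilon, rng)

def server_update(theta, messages, sigma, lr):
    """Evolution strategies step: reconstruct each client's perturbation from
    its seed, weight it by the reported metric, and take a gradient-ascent
    step on the estimated expected online metric."""
    grad = np.zeros_like(theta)
    for seed, reported in messages:
        noise = np.random.default_rng(seed).standard_normal(theta.shape)
        grad += reported * noise
    grad /= (len(messages) * sigma)
    return theta + lr * grad

Under these assumptions, each interaction sends only a perturbation seed and a single privatized, finitely-valued metric report, so per-client communication and on-device computation stay small, in line with requirements (a) and (b) above.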