Unbiased Learning-to-Rank with Biased Feedback

T. Joachims, Adith Swaminathan, Tobias Schnabel
{"title":"无偏学习排序与有偏反馈","authors":"T. Joachims, Adith Swaminathan, Tobias Schnabel","doi":"10.1145/3018661.3018699","DOIUrl":null,"url":null,"abstract":"Implicit feedback (e.g., clicks, dwell times, etc.) is an abundant source of data in human-interactive systems. While implicit feedback has many advantages (e.g., it is inexpensive to collect, user centric, and timely), its inherent biases are a key obstacle to its effective use. For example, position bias in search rankings strongly influences how many clicks a result receives, so that directly using click data as a training signal in Learning-to-Rank (LTR) methods yields sub-optimal results. To overcome this bias problem, we present a counterfactual inference framework that provides the theoretical basis for unbiased LTR via Empirical Risk Minimization despite biased data. Using this framework, we derive a Propensity-Weighted Ranking SVM for discriminative learning from implicit feedback, where click models take the role of the propensity estimator. In contrast to most conventional approaches to de-biasing the data using click models, this allows training of ranking functions even in settings where queries do not repeat. Beyond the theoretical support, we show empirically that the proposed learning method is highly effective in dealing with biases, that it is robust to noise and propensity model misspecification, and that it scales efficiently. We also demonstrate the real-world applicability of our approach on an operational search engine, where it substantially improves retrieval performance.","PeriodicalId":344017,"journal":{"name":"Proceedings of the Tenth ACM International Conference on Web Search and Data Mining","volume":"70 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-08-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"447","resultStr":"{\"title\":\"Unbiased Learning-to-Rank with Biased Feedback\",\"authors\":\"T. Joachims, Adith Swaminathan, Tobias Schnabel\",\"doi\":\"10.1145/3018661.3018699\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Implicit feedback (e.g., clicks, dwell times, etc.) is an abundant source of data in human-interactive systems. While implicit feedback has many advantages (e.g., it is inexpensive to collect, user centric, and timely), its inherent biases are a key obstacle to its effective use. For example, position bias in search rankings strongly influences how many clicks a result receives, so that directly using click data as a training signal in Learning-to-Rank (LTR) methods yields sub-optimal results. To overcome this bias problem, we present a counterfactual inference framework that provides the theoretical basis for unbiased LTR via Empirical Risk Minimization despite biased data. Using this framework, we derive a Propensity-Weighted Ranking SVM for discriminative learning from implicit feedback, where click models take the role of the propensity estimator. In contrast to most conventional approaches to de-biasing the data using click models, this allows training of ranking functions even in settings where queries do not repeat. Beyond the theoretical support, we show empirically that the proposed learning method is highly effective in dealing with biases, that it is robust to noise and propensity model misspecification, and that it scales efficiently. 
We also demonstrate the real-world applicability of our approach on an operational search engine, where it substantially improves retrieval performance.\",\"PeriodicalId\":344017,\"journal\":{\"name\":\"Proceedings of the Tenth ACM International Conference on Web Search and Data Mining\",\"volume\":\"70 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-08-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"447\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Tenth ACM International Conference on Web Search and Data Mining\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3018661.3018699\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Tenth ACM International Conference on Web Search and Data Mining","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3018661.3018699","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 447

Abstract

Implicit feedback (e.g., clicks, dwell times, etc.) is an abundant source of data in human-interactive systems. While implicit feedback has many advantages (e.g., it is inexpensive to collect, user centric, and timely), its inherent biases are a key obstacle to its effective use. For example, position bias in search rankings strongly influences how many clicks a result receives, so that directly using click data as a training signal in Learning-to-Rank (LTR) methods yields sub-optimal results. To overcome this bias problem, we present a counterfactual inference framework that provides the theoretical basis for unbiased LTR via Empirical Risk Minimization despite biased data. Using this framework, we derive a Propensity-Weighted Ranking SVM for discriminative learning from implicit feedback, where click models take the role of the propensity estimator. In contrast to most conventional approaches to de-biasing the data using click models, this allows training of ranking functions even in settings where queries do not repeat. Beyond the theoretical support, we show empirically that the proposed learning method is highly effective in dealing with biases, that it is robust to noise and propensity model misspecification, and that it scales efficiently. We also demonstrate the real-world applicability of our approach on an operational search engine, where it substantially improves retrieval performance.
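To make the propensity-weighting idea in the abstract concrete, the sketch below shows an inverse-propensity-scored pairwise objective for a single query: each clicked document generates hinge penalties against the other candidates, and those penalties are divided by the click's estimated examination propensity (the quantity a position-based click model would supply), so clicks at poorly examined positions count for more. This is a minimal illustrative sketch under those assumptions, not the authors' Propensity-Weighted Ranking SVM; the function and variable names are hypothetical.

```python
def propensity_weighted_pairwise_loss(scores, clicked_idx, propensities, margin=1.0):
    """Inverse-propensity-weighted pairwise hinge loss for one query.

    scores       : list of model scores, one per candidate document
    clicked_idx  : indices of the documents that received a click
    propensities : estimated examination probability for each clicked
                   document (e.g., from a position-based click model)
    """
    loss = 0.0
    for i, p in zip(clicked_idx, propensities):
        for j in range(len(scores)):
            if j == i:
                continue
            # Hinge penalty whenever a non-clicked candidate is not
            # out-scored by the clicked one by at least `margin`;
            # dividing by the propensity p de-biases for position.
            loss += max(0.0, margin - (scores[i] - scores[j])) / p
    return loss


# Toy usage: a click on the third-ranked result, which had only a 30%
# chance of being examined, contributes with weight 1 / 0.3.
scores = [2.1, 1.7, 0.4, 0.9]
print(propensity_weighted_pairwise_loss(scores, clicked_idx=[2], propensities=[0.3]))
```

Minimizing this weighted loss over many queries is, in expectation over which results users examine, equivalent to minimizing the loss one would obtain from unbiased relevance labels, which is the counterfactual ERM argument the abstract refers to.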