Taking the Counterfactual Online: Efficient and Unbiased Online Evaluation for Ranking

Proceedings of the 2020 ACM SIGIR on International Conference on Theory of Information Retrieval
Pub Date: 2020-07-24
DOI: 10.1145/3409256.3409820

Harrie Oosterhuis, M. de Rijke


Citations: 17

Abstract

Counterfactual evaluation can estimate Click-Through-Rate (CTR) differences between ranking systems based on historical interaction data, while mitigating the effect of position bias and item-selection bias. We introduce the novel Logging-Policy Optimization Algorithm (LogOpt), which optimizes the policy for logging data so that the counterfactual estimate has minimal variance. As minimizing variance leads to faster convergence, LogOpt increases the data-efficiency of counterfactual estimation. LogOpt turns the counterfactual approach - which is indifferent to the logging policy - into an online approach, where the algorithm decides what rankings to display. We prove that, as an online evaluation method, LogOpt is unbiased w.r.t. position and item-selection bias, unlike existing interleaving methods. Furthermore, we perform large-scale experiments by simulating comparisons between thousands of rankers. Our results show that while interleaving methods make systematic errors, LogOpt is as efficient as interleaving without being biased.
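The kind of counterfactual estimate the abstract refers to can be illustrated with a minimal sketch. It assumes a position-based click model (a click requires that the item is examined, and examination depends only on rank) and an inverse-propensity-scored estimator that weights each logged click by the ratio of the exposure difference between the two compared rankers to the exposure under the logging policy. The function names, the `theta` examination probabilities, and the toy data are all hypothetical and are not taken from the paper; the step that LogOpt adds, choosing the logging policy to minimize the variance of this estimate, is not shown.

```python
def expected_exposure(policy, theta):
    """Expected examination probability of each document under a policy.

    policy: dict mapping a ranking (tuple of doc ids) to its probability.
    theta:  list of assumed examination probabilities per rank.
    """
    exposure = {}
    for ranking, prob in policy.items():
        for rank, doc in enumerate(ranking):
            exposure[doc] = exposure.get(doc, 0.0) + prob * theta[rank]
    return exposure


def ips_ctr_difference(click_log, policy_a, policy_b, logging_policy, theta):
    """Estimate CTR(policy_a) - CTR(policy_b) from clicks logged under logging_policy.

    click_log: one entry per logged impression, each a list of clicked doc ids.
    Every document whose exposure differs between the two policies must have
    non-zero exposure under the logging policy, otherwise the weight is undefined.
    """
    rho = expected_exposure(logging_policy, theta)   # logging propensities
    exp_a = expected_exposure(policy_a, theta)
    exp_b = expected_exposure(policy_b, theta)
    total = 0.0
    for clicked_docs in click_log:
        for doc in clicked_docs:
            weight = exp_a.get(doc, 0.0) - exp_b.get(doc, 0.0)
            total += weight / rho[doc]
    return total / len(click_log)


# Toy setup (all values are illustrative assumptions).
theta = [1.0, 0.5]                                   # rank 1 always examined, rank 2 half the time
policy_a = {("x", "y"): 1.0}                         # ranker A always shows x first
policy_b = {("y", "x"): 1.0}                         # ranker B always shows y first
logging_policy = {("x", "y"): 0.5, ("y", "x"): 0.5}  # logs both rankings equally often
click_log = [["x"], ["y"], [], ["x"]]                # clicked docs per logged impression

estimate = ips_ctr_difference(click_log, policy_a, policy_b, logging_policy, theta)
print(estimate)
```

Under the stated click model, the expected number of clicks on a document equals its logging propensity times its relevance, so dividing by the propensity removes the dependence on where the logging policy placed the document. The estimate then depends on the logging policy only through its variance, which is the quantity a logging-policy optimizer would target.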

