Estimating Long-term Effects from Experimental Data

Proceedings of the 16th ACM Conference on Recommender Systems. Pub Date: 2022-09-18. DOI: 10.1145/3523227.3547398

Ziyang Tang, Yiheng Duan, Steven H. Zhu, Stephanie S. Zhang, Lihong Li

Citations: 1

Abstract

A/B testing is a powerful tool for a company to make informed decisions about its services and products. A limitation of A/B tests is that they do not easily extend to measuring post-experiment (long-term) differences. In this talk, we study a different approach inspired by recent advances in off-policy evaluation in reinforcement learning (RL). The basic RL approach assumes that customer behavior follows a stationary Markov process and estimates the average engagement metric once the process reaches its steady state. In realistic scenarios, however, the stationarity assumption is often violated because of weekly variation and seasonality effects. To tackle this challenge, we propose a variant that relaxes the stationarity assumption. We empirically tested both the stationary and nonstationary approaches on a synthetic dataset and an online store dataset.
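To make the steady-state idea concrete, the following is a minimal tabular sketch, not the authors' estimator. It assumes a small discrete state space, logged tuples of (state, engagement, next state) from one experiment arm, and a stationary Markov chain. The function names `stationary_distribution` and `long_term_average`, the log format, and the smoothing constant are all hypothetical choices made for illustration.

```python
import numpy as np


def stationary_distribution(P):
    """Stationary distribution pi of a row-stochastic matrix P (pi @ P == pi)."""
    n = P.shape[0]
    # Stack the balance equations pi (P - I) = 0 with the constraint sum(pi) = 1
    # and solve the resulting linear system in the least-squares sense.
    A = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi


def long_term_average(logs, n_states):
    """Estimate the steady-state average engagement from logged transitions.

    logs: iterable of (state, engagement, next_state) tuples (assumed format).
    """
    # Tiny pseudo-count so that unvisited states still yield a valid row
    # (an illustrative assumption, not part of the source).
    counts = np.full((n_states, n_states), 1e-6)
    reward_sum = np.zeros(n_states)
    visits = np.zeros(n_states)
    for s, r, s_next in logs:
        counts[s, s_next] += 1
        reward_sum[s] += r
        visits[s] += 1
    # Empirical transition matrix and per-state mean engagement.
    P = counts / counts.sum(axis=1, keepdims=True)
    mean_reward = reward_sum / np.maximum(visits, 1)
    pi = stationary_distribution(P)
    return float(pi @ mean_reward)
```

Under these assumptions, one would call `long_term_average` separately on the treatment and control logs and compare the two estimates. The sketch does not handle the weekly or seasonal nonstationarity that the abstract identifies as the main practical difficulty.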

