Network A/B Testing: Nonparametric Statistical Significance Test Based on Cluster-Level Permutation

Hongwei Shang, Xiaolin Shi, Bai Jiang
{"title":"Network A/B Testing: Nonparametric Statistical Significance Test Based on Cluster-Level Permutation","authors":"Hongwei Shang, Xiaolin Shi, Bai Jiang","doi":"10.6339/23-jds1112","DOIUrl":null,"url":null,"abstract":"A/B testing is widely used for comparing two versions of a product and evaluating new proposed product features. It is of great importance for decision-making and has been applied as a golden standard in the IT industry. It is essentially a form of two-sample statistical hypothesis testing. Average treatment effect (ATE) and the corresponding p-value can be obtained under certain assumptions. One key assumption in traditional A/B testing is the stable-unit-treatment-value assumption (SUTVA): there is no interference among different units. It means that the observation on one unit is unaffected by the particular assignment of treatments to the other units. Nonetheless, interference is very common in social network settings where people communicate and spread information to their neighbors. Therefore, the SUTVA assumption is violated. Analysis ignoring this network effect will lead to biased estimation of ATE. Most existing works focus mainly on the design of experiment and data analysis in order to produce estimators with good performance in regards to bias and variance. Little attention has been paid to the calculation of p-value. We work on the calculation of p-value for the ATE estimator in network A/B tests. After a brief review of existing research methods on design of experiment based on graph cluster randomization and different ATE estimation methods, we propose a permutation method for calculating p-value based on permutation test at the cluster level. The effectiveness of the method against that based on individual-level permutation is validated in a simulation study mimicking realistic settings.","PeriodicalId":73699,"journal":{"name":"Journal of data science : JDS","volume":"1 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of data science : JDS","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.6339/23-jds1112","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

A/B testing is widely used for comparing two versions of a product and evaluating new proposed product features. It is of great importance for decision-making and has been applied as a golden standard in the IT industry. It is essentially a form of two-sample statistical hypothesis testing. Average treatment effect (ATE) and the corresponding p-value can be obtained under certain assumptions. One key assumption in traditional A/B testing is the stable-unit-treatment-value assumption (SUTVA): there is no interference among different units. It means that the observation on one unit is unaffected by the particular assignment of treatments to the other units. Nonetheless, interference is very common in social network settings where people communicate and spread information to their neighbors. Therefore, the SUTVA assumption is violated. Analysis ignoring this network effect will lead to biased estimation of ATE. Most existing works focus mainly on the design of experiment and data analysis in order to produce estimators with good performance in regards to bias and variance. Little attention has been paid to the calculation of p-value. We work on the calculation of p-value for the ATE estimator in network A/B tests. After a brief review of existing research methods on design of experiment based on graph cluster randomization and different ATE estimation methods, we propose a permutation method for calculating p-value based on permutation test at the cluster level. The effectiveness of the method against that based on individual-level permutation is validated in a simulation study mimicking realistic settings.
网络A/B测试:基于集群水平排列的非参数统计显著性检验
A/B测试广泛用于比较产品的两个版本和评估新提出的产品功能。它对决策具有重要意义,并已成为It行业的黄金标准。它本质上是一种双样本统计假设检验。在一定的假设条件下,可以得到平均处理效果(ATE)及其对应的p值。传统A/B测试中的一个关键假设是稳定单元处理值假设(SUTVA):不同单元之间不存在干扰。这意味着对一个单元的观察不受对其他单元的特殊处理分配的影响。尽管如此,在人们与邻居交流和传播信息的社交网络环境中,干扰是非常常见的。因此,违反了SUTVA假设。忽略这种网络效应的分析将导致ATE估计的偏差。大多数现有的工作主要集中在实验设计和数据分析上,以产生在偏差和方差方面具有良好性能的估计器。p值的计算很少受到重视。我们研究了网络A/B测试中ATE估计器的p值的计算。在简要回顾现有基于图类随机化的实验设计研究方法和不同ATE估计方法的基础上,提出了一种基于聚类水平置换检验的p值计算置换方法。在模拟现实环境的仿真研究中,验证了该方法对基于个体水平排列的方法的有效性。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信