Online Experimentation Diagnosis and Troubleshooting Beyond AA Validation

Zhenyu Zhao, Miao Chen, Don Matheson, M. Stone
{"title":"Online Experimentation Diagnosis and Troubleshooting Beyond AA Validation","authors":"Zhenyu Zhao, Miao Chen, Don Matheson, M. Stone","doi":"10.1109/DSAA.2016.61","DOIUrl":null,"url":null,"abstract":"Online experiments are frequently used at internet companies to evaluate the impact of new designs, features, or code changes on user behavior. Though the experiment design is straightforward in theory, in practice, there are many problems that can complicate the interpretation of results and render any conclusions about changes in user behavior invalid. Many of these problems are difficult to detect and often go unnoticed. Acknowledging and diagnosing these issues can prevent experiment owners from making decisions based on fundamentally flawed data. When conducting online experiments, data quality assurance is a top priority before attributing the impact to changes in user behavior. While some problems can be detected by running AA tests before introducing the treatment, many problems do not emerge during the AA period, and appear only during the AB period. Prior work on this topic has not addressed troubleshooting during the AB period. In this paper, we present lessons learned from experiments on various internet consumer products at Yahoo, as well as diagnostic and remedy procedures. Most of the examples and troubleshooting procedures presented here are generic to online experimentation at other companies. Some, such as traffic splitting problems and outlier problems have been documented before, but others have not previously been described in the literature.","PeriodicalId":193885,"journal":{"name":"2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA)","volume":"51 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"45","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DSAA.2016.61","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 45

Abstract

Online experiments are frequently used at internet companies to evaluate the impact of new designs, features, or code changes on user behavior. Though the experiment design is straightforward in theory, in practice, there are many problems that can complicate the interpretation of results and render any conclusions about changes in user behavior invalid. Many of these problems are difficult to detect and often go unnoticed. Acknowledging and diagnosing these issues can prevent experiment owners from making decisions based on fundamentally flawed data. When conducting online experiments, data quality assurance is a top priority before attributing the impact to changes in user behavior. While some problems can be detected by running AA tests before introducing the treatment, many problems do not emerge during the AA period, and appear only during the AB period. Prior work on this topic has not addressed troubleshooting during the AB period. In this paper, we present lessons learned from experiments on various internet consumer products at Yahoo, as well as diagnostic and remedy procedures. Most of the examples and troubleshooting procedures presented here are generic to online experimentation at other companies. Some, such as traffic splitting problems and outlier problems have been documented before, but others have not previously been described in the literature.
在线实验诊断和故障排除超出AA验证
互联网公司经常使用在线实验来评估新设计、功能或代码更改对用户行为的影响。虽然实验设计在理论上是简单明了的,但在实践中,有许多问题会使结果的解释复杂化,并使任何关于用户行为变化的结论无效。这些问题中有许多很难发现,而且经常被忽视。承认和诊断这些问题可以防止实验所有者根据根本有缺陷的数据做出决定。在进行在线实验时,在将影响归因于用户行为的变化之前,数据质量保证是重中之重。虽然一些问题可以通过在引入治疗之前进行AA测试来检测,但许多问题在AA期不会出现,而只在AB期出现。在此主题之前的工作没有解决AB期间的故障排除问题。在本文中,我们介绍了从雅虎各种互联网消费产品的实验中获得的经验教训,以及诊断和补救程序。这里提供的大多数示例和故障排除过程都适用于其他公司的在线实验。一些问题,如交通分流问题和离群值问题,以前已经有文献记载,但其他问题以前没有在文献中描述。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信