Zikun Ye, Zhiqi Zhang, Dennis Zhang, Heng Zhang, Renyu Zhang
Deep Learning Based Causal Inference for Large-Scale Combinatorial Experiments: Theory and Empirical Evidence
Proceedings of the 24th ACM Conference on Economics and Computation, published 2023-07-07
DOI: https://doi.org/10.1145/3580507.3597718
Citations: 2
Abstract
Large-scale online platforms launch hundreds of randomized experiments (a.k.a. A/B tests) every day to iterate on their operations and marketing strategies, yet the combinations of these treatments are typically not exhaustively tested. This raises an important question of both academic and practical interest: without observing the outcomes of all treatment combinations, how can one estimate the causal effect of any treatment combination and identify the optimal one? We develop a novel framework that combines deep learning and double machine learning to estimate the causal effect of any treatment combination for each user on the platform when only a small subset of treatment combinations is observed. Our proposed framework, called debiased deep learning (DeDL), exploits Neyman orthogonality and combines interpretable and flexible structural layers in deep learning. We prove theoretically that this framework yields consistent and asymptotically normal estimators under mild assumptions, which makes it possible to identify the best treatment combination when only a few combinations are observed. To validate our method empirically, we collaborate with a large-scale video-sharing platform and implement our framework for three experiments involving three treatments, in which every combination of treatments is tested. When only a subset of treatment combinations is observed, our DeDL approach significantly outperforms other benchmarks in accurately estimating and inferring the average treatment effect (ATE) of any treatment combination and in identifying the optimal treatment combination.
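To make the combinatorial setting concrete, the following is a minimal sketch of the problem the abstract poses, not of DeDL itself. It enumerates the 2^3 = 8 combinations of three binary treatments, observes only the control and single-treatment arms, and extrapolates to the unobserved combinations by summing single-treatment ATEs. The additive rule is an illustrative baseline chosen here, not one taken from the paper. The outcome model, effect sizes, sample size, and all function names are hypothetical and the data are synthetic.

```python
import itertools
import random
from statistics import mean


def all_combinations(m):
    """Enumerate all 2**m on/off combinations of m binary treatments."""
    return list(itertools.product((0, 1), repeat=m))


def simulate_outcome(combo, rng):
    """Synthetic outcome (assumption): main effects, one interaction, noise."""
    main_effects = (0.5, 0.3, 0.2)  # made-up effect sizes
    y = 1.0 + sum(b * t for b, t in zip(main_effects, combo))
    y -= 0.4 * combo[0] * combo[1]  # made-up interaction between treatments 1 and 2
    return y + rng.gauss(0.0, 0.1)


rng = random.Random(0)
m = 3
combos = all_combinations(m)
control = (0,) * m

# Only the control arm and the single-treatment arms are "observed".
observed = [c for c in combos if sum(c) <= 1]
samples = {c: [simulate_outcome(c, rng) for _ in range(2000)] for c in observed}

# Difference-in-means ATE of each observed combination against control.
ate = {c: mean(samples[c]) - mean(samples[control]) for c in observed}


def additive_ate(combo):
    """Naive baseline: sum the single-treatment ATEs of the active treatments."""
    singles = [tuple(int(i == j) for i in range(m)) for j in range(m) if combo[j]]
    return sum(ate[s] for s in singles)


unobserved = [c for c in combos if c not in observed]
for c in unobserved:
    print(c, round(additive_ate(c), 3))
```

In this synthetic setup the additive baseline ignores the interaction between the first two treatments, so its estimate for combinations such as (1, 1, 0) is biased by construction. Closing that kind of gap from a small observed subset is the problem the abstract describes.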