Covariate balancing based on kernel density estimates for controlled experiments

Impact factor: 1.3 · JCR Q3 (Statistics & Probability)

Statistical Theory and Related Fields, 5(1), pp. 102-113 · Pub Date: 2020-08-12 · DOI: 10.1080/24754269.2021.1878742

Yiou Li, Lulu Kang, Xiao Huang


Citations: 2

Abstract

Controlled experiments are widely used to investigate the causal relationship between input factors and experimental outcomes. A completely randomised design is usually used to assign treatment levels to experimental units at random. When covariates of the experimental units are available, the experimental design should balance the covariates across the treatment groups, so that statistical inference on the treatment effects is not confounded with any possible effects of the covariates. However, covariate imbalance often exists, because the experiment is carried out on a single realisation of the complete randomisation. Imbalance is more likely to occur, and to be more severe, when the number of experimental units is small or moderate. In this paper, we introduce a new covariate balancing criterion, which measures the differences between kernel density estimates of the covariates of the treatment groups. To achieve covariate balance before the treatments are randomly assigned, we partition the experimental units by minimising the criterion, then randomly assign the treatment levels to the partitioned groups. Through numerical examples, we show that the proposed partition approach improves the accuracy of the difference-in-mean estimator and outperforms both complete randomisation and rerandomisation.
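The partition-then-randomise idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the exact criterion or optimiser, so everything below is an assumption made for illustration: two equal-sized groups, Gaussian kernels with a common bandwidth `h`, the integrated squared difference between the two kernel density estimates as the imbalance measure, and a greedy pairwise-swap search. The function names (`gaussian_gram`, `imbalance`, `balanced_partition`, `assign_treatment`) are hypothetical and are not taken from the paper.

```python
import numpy as np


def gaussian_gram(X, h=1.0):
    """Pairwise integrals of two Gaussian kernels with bandwidth h.

    The product of two N(., h^2 I) kernels centred at a and b integrates to
    the N(0, 2 h^2 I) density evaluated at a - b.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    d = X.shape[1]
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq / (4.0 * h**2)) / (4.0 * np.pi * h**2) ** (d / 2.0)


def imbalance(K, labels):
    """Integrated squared difference between the KDEs of groups 1 and 0."""
    in_a = np.asarray(labels) == 1
    w = np.where(in_a, 1.0 / in_a.sum(), -1.0 / (~in_a).sum())
    return float(w @ K @ w)


def balanced_partition(X, h=1.0, max_sweeps=20, seed=0):
    """Greedy swap search (an assumed optimiser, not the paper's algorithm)."""
    rng = np.random.default_rng(seed)
    K = gaussian_gram(X, h)
    n = K.shape[0]
    labels = np.zeros(n, dtype=int)
    labels[rng.permutation(n)[: n // 2]] = 1  # random equal-sized start
    best = imbalance(K, labels)
    for _ in range(max_sweeps):
        improved = False
        for i in np.flatnonzero(labels == 1):
            for j in np.flatnonzero(labels == 0):
                labels[i], labels[j] = 0, 1  # try swapping units i and j
                cand = imbalance(K, labels)
                if cand < best - 1e-12:
                    best, improved = cand, True
                    break  # keep the swap and move on to the next unit
                labels[i], labels[j] = 1, 0  # undo the swap
        if not improved:
            break
    return labels, best


def assign_treatment(labels, seed=0):
    """Randomly decide which of the two partitioned groups is treated."""
    rng = np.random.default_rng(seed)
    treated = rng.integers(2)
    return np.where(np.asarray(labels) == treated, "treatment", "control")


if __name__ == "__main__":
    X = np.random.default_rng(1).normal(size=(20, 2))  # synthetic covariates
    labels, value = balanced_partition(X, h=0.5)
    print(value, assign_treatment(labels))
```

Because the groups are fixed before the coin flip in `assign_treatment`, the randomisation here happens only at the level of which group receives which treatment. The bandwidth and the swap search are placeholders; a different kernel, bandwidth rule, or optimiser could be substituted without changing the overall structure.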




Source journal

Statistical Theory and Related Fields Mathematics-Analysis

CiteScore

0.90

Self-citation rate

20.00%

Articles published