No star is good news: A unified look at rerandomization based on p-values from covariate balance tests
Authors: Anqi Zhao, Peng Ding
Journal: Journal of Econometrics, 241(1), Article 105724
Published: 2024-03-11
DOI: 10.1016/j.jeconom.2024.105724
URL: https://www.sciencedirect.com/science/article/pii/S0304407624000708
Abstract
Randomized experiments balance all covariates on average and are considered the gold standard for estimating treatment effects. Chance imbalances are nonetheless common in realized treatment allocations. To inform readers of the comparability of treatment groups at baseline, contemporary scientific publications often report covariate balance tables with not only covariate means by treatment group but also the associated p-values from significance tests of their differences. The practical need to avoid small p-values as indicators of poor balance motivates balance checks and rerandomization based on these p-values from covariate balance tests (ReP) as an attractive tool for improving covariate balance in designing randomized experiments. Despite the intuitiveness of such a strategy and its possibly already widespread use in practice, the literature lacks results about its implications for subsequent inference, subjecting many effectively rerandomized experiments to possibly inefficient analyses. To fill this gap, we examine a variety of potentially useful schemes for ReP and quantify their impact on subsequent inference. Specifically, we focus on three estimators of the average treatment effect from the unadjusted, additive, and interacted linear regressions of the outcome on treatment, respectively, and derive their asymptotic sampling properties under ReP. The main findings are threefold. First, the estimator from the interacted regression is asymptotically the most efficient under all ReP schemes examined, and permits convenient regression-assisted inference identical to that under complete randomization. Second, ReP, in contrast to complete randomization, improves the asymptotic efficiency of the estimators from the unadjusted and additive regressions. Standard regression analyses are accordingly still valid but in general overconservative. Third, ReP reduces the asymptotic conditional biases of the three estimators and improves their coherence in terms of mean squared difference. These results establish ReP as a convenient tool for improving covariate balance in designing randomized experiments, and we recommend using the interacted regression for analyzing data from ReP designs.
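To make the design and the three regressions concrete, the following is a minimal sketch of one possible ReP scheme, not the authors' implementation. It assumes a covariate-wise acceptance rule (redraw until every balance-test p-value is at least a threshold `alpha`), a normal approximation for the p-values, and simulated toy data; the function names `balance_p_values`, `rep_assign`, and `ate_estimates`, the threshold 0.2, and the data-generating model are all hypothetical choices made for illustration.

```python
import math
import numpy as np

def balance_p_values(X, z):
    """Two-sided p-values for covariate-wise difference-in-means tests.

    Assumption of this sketch: normal approximation with unpooled variances.
    """
    x1, x0 = X[z == 1], X[z == 0]
    diff = x1.mean(axis=0) - x0.mean(axis=0)
    se = np.sqrt(x1.var(axis=0, ddof=1) / len(x1) + x0.var(axis=0, ddof=1) / len(x0))
    return np.array([math.erfc(abs(t) / math.sqrt(2)) for t in diff / se])

def rep_assign(X, n_treated, alpha, rng, max_draws=10_000):
    """Redraw a completely randomized assignment until all p-values are >= alpha."""
    base = np.zeros(X.shape[0], dtype=int)
    base[:n_treated] = 1
    for _ in range(max_draws):
        z = rng.permutation(base)
        if balance_p_values(X, z).min() >= alpha:
            return z
    raise RuntimeError("no acceptable assignment found; lower alpha or raise max_draws")

def ate_estimates(y, z, X):
    """Coefficient on z from the unadjusted, additive, and interacted regressions."""
    Xc = X - X.mean(axis=0)  # centre covariates so the z coefficient targets the ATE
    ones = np.ones((len(y), 1))
    zcol = z.reshape(-1, 1).astype(float)
    designs = {
        "unadjusted": np.hstack([ones, zcol]),
        "additive": np.hstack([ones, zcol, Xc]),
        "interacted": np.hstack([ones, zcol, Xc, zcol * Xc]),
    }
    return {name: float(np.linalg.lstsq(D, y, rcond=None)[0][1])
            for name, D in designs.items()}

rng = np.random.default_rng(0)
n, k = 200, 3
X = rng.normal(size=(n, k))
z = rep_assign(X, n_treated=n // 2, alpha=0.2, rng=rng)
# Toy outcomes with a true average treatment effect of 1.0 (simulated, not from the paper)
y = 1.0 * z + X @ np.array([1.0, -0.5, 0.25]) + rng.normal(scale=0.5, size=n)
estimates = ate_estimates(y, z, X)
print(estimates)
```

The unadjusted coefficient is simply the difference in outcome means between arms, while the interacted specification is the one the abstract recommends for analyzing ReP designs. Standard errors are omitted here; the sketch only shows how the point estimates are formed.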
About the journal:
The Journal of Econometrics serves as an outlet for important, high-quality, new research in both theoretical and applied econometrics. The scope of the Journal includes papers dealing with identification, estimation, testing, decision, and prediction issues encountered in economic research. Classical Bayesian statistics, and machine learning methods, are decidedly within the range of the Journal's interests. The Annals of Econometrics is a supplement to the Journal of Econometrics.