Asymptotic Theory of the Best-Choice Rerandomization using the Mahalanobis Distance

arXiv - MATH - Statistics Theory. Pub Date: 2023-12-05. DOI: arxiv-2312.02513

Yuhao Wang, Xinran Li


Citations: 0

Abstract

Rerandomization, a design that utilizes pretreatment covariates and improves their balance between different treatment groups, has recently received attention in both theory and practice. At least two types of rerandomization are used in practice: the first rerandomizes the treatment assignment until the covariate imbalance falls below a prespecified threshold; the second randomizes the treatment assignment multiple times and chooses the assignment with the best covariate balance. In this paper we consider the second type, namely the best-choice rerandomization, whose theory and inference are still lacking in the literature. In particular, we focus on the best-choice rerandomization that uses the Mahalanobis distance to measure covariate imbalance; this distance is one of the most commonly used imbalance measures for multivariate covariates and is invariant to affine transformations of the covariates. We study the large-sample repeated-sampling properties of the best-choice rerandomization, allowing both the number of covariates and the number of tried complete randomizations to increase with the sample size. We show that the asymptotic distribution of the difference-in-means estimator is more concentrated around the true average treatment effect under rerandomization than under complete randomization, and we propose large-sample accurate confidence intervals for rerandomization that are shorter than those for the completely randomized experiment. We further demonstrate that, with a moderate number of covariates and with the number of tried randomizations increasing polynomially with the sample size, the best-choice rerandomization can achieve the ideally optimal precision that one can expect even with perfectly balanced covariates. The developed theory and methods for rerandomization are also illustrated using real field experiments.
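The best-choice design described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The function names `mahalanobis_imbalance` and `best_choice_rerandomization`, the fixed treated-group size, and the simulated covariates are all assumptions made for illustration. The sketch also assumes that the Mahalanobis distance is standardized by the covariance of the difference in covariate means under complete randomization, which for group sizes n1 and n0 equals n / (n1 * n0) times the finite-population covariance of the covariates.

```python
import numpy as np


def mahalanobis_imbalance(X, z):
    """Mahalanobis distance of the treated-minus-control covariate mean difference.

    X: (n, p) array of pretreatment covariates; z: length-n 0/1 assignment.
    """
    n = X.shape[0]
    n1 = int(z.sum())
    n0 = n - n1
    diff = X[z == 1].mean(axis=0) - X[z == 0].mean(axis=0)
    # Assumed standardization: covariance of the difference in means under
    # complete randomization, n / (n1 * n0) * S, with S the finite-population
    # covariance (divisor n - 1, the np.cov default).
    S = np.atleast_2d(np.cov(X, rowvar=False))
    cov_diff = S * n / (n1 * n0)
    return float(diff @ np.linalg.solve(cov_diff, diff))


def best_choice_rerandomization(X, n_treated, n_tries, rng):
    """Draw n_tries complete randomizations and keep the best-balanced one."""
    n = X.shape[0]
    best_z, best_m = None, np.inf
    for _ in range(n_tries):
        # One complete randomization: n_treated units chosen without replacement.
        z = np.zeros(n, dtype=int)
        z[rng.choice(n, size=n_treated, replace=False)] = 1
        m = mahalanobis_imbalance(X, z)
        if m < best_m:
            best_z, best_m = z, m
    return best_z, best_m


# Hypothetical usage on simulated covariates (not data from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
z, m = best_choice_rerandomization(X, n_treated=50, n_tries=200, rng=rng)
```

With fixed group sizes, the scaling factor n / (n1 * n0) does not change which assignment is selected; it only puts the distance on the scale on which it is approximately chi-squared under complete randomization. Solving a linear system instead of inverting the covariance matrix is a numerical convenience, not something taken from the paper.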
