PCA Rerandomization
Hengtao Zhang, Guosheng Yin, Donald B. Rubin
DOI: 10.1002/cjs.11765
Published: 2023-02-16
https://onlinelibrary.wiley.com/doi/10.1002/cjs.11765
Mahalanobis distance of covariate means between treatment and control groups is often adopted as a balance criterion when implementing a rerandomization strategy. However, this criterion may not work well for high-dimensional cases because it balances all orthogonalized covariates equally. We propose using principal component analysis (PCA) to identify proper subspaces in which Mahalanobis distance should be calculated. Not only can PCA effectively reduce the dimensionality for high-dimensional covariates, but it also provides computational simplicity by focusing on the top orthogonal components. The PCA rerandomization scheme has desirable theoretical properties for balancing covariates and thereby improving the estimation of average treatment effects. This conclusion is supported by numerical studies using both simulated and real examples.
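To make the idea concrete, the following is a minimal sketch of how a PCA-based rerandomization loop might look. It is not the authors' implementation. It assumes complete randomization with a fixed number of treated units, takes the top `k` principal components of the centered covariates, and redraws the assignment until the Mahalanobis distance between the group means of those component scores falls below an acceptance threshold. The function names (`pca_scores`, `mahalanobis_balance`, `pca_rerandomize`), the parameters `k`, `threshold`, and `max_draws`, and the simulated data are all illustrative assumptions; the abstract does not specify how the number of components or the threshold should be chosen.

```python
import numpy as np


def pca_scores(X, k):
    """Project centered covariates onto the top-k principal components."""
    Xc = X - X.mean(axis=0)
    # SVD of the centered matrix: rows of Vt are the principal directions,
    # ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T


def mahalanobis_balance(Z, w):
    """Mahalanobis distance between treated and control means of Z."""
    n1 = int(w.sum())
    n0 = len(w) - n1
    diff = Z[w == 1].mean(axis=0) - Z[w == 0].mean(axis=0)
    # Covariance of the mean difference under complete randomization.
    cov = np.atleast_2d(np.cov(Z, rowvar=False)) * (1.0 / n1 + 1.0 / n0)
    return float(diff @ np.linalg.solve(cov, diff))


def pca_rerandomize(X, k, threshold, n_treated, rng, max_draws=10_000):
    """Redraw assignments until balance on the top-k PC scores is acceptable.

    k, threshold, and max_draws are hypothetical tuning parameters chosen
    for illustration only.
    """
    Z = pca_scores(X, k)
    n = X.shape[0]
    for draw in range(1, max_draws + 1):
        w = np.zeros(n, dtype=int)
        w[rng.choice(n, size=n_treated, replace=False)] = 1
        m = mahalanobis_balance(Z, w)
        if m <= threshold:
            return w, m, draw
    raise RuntimeError("no acceptable assignment found within max_draws")


# Illustrative usage on simulated covariates (not data from the paper).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 50))
w, m, draws = pca_rerandomize(X, k=5, threshold=2.0, n_treated=50, rng=rng)
```

Restricting the distance to `k` components means the acceptance rule concentrates on the directions of largest covariate variance rather than weighting all orthogonalized covariates equally, which is the limitation of the full Mahalanobis criterion that the abstract points to in high dimensions.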