A ROBUST AND EFFICIENT APPROACH TO CAUSAL INFERENCE BASED ON SPARSE SUFFICIENT DIMENSION REDUCTION.

Impact factor 3.2; Zone 1 (Mathematics); JCR Q1, STATISTICS & PROBABILITY

Annals of Statistics Pub Date : 2019-06-01 Epub Date: 2019-02-13 DOI:10.1214/18-AOS1722

Shujie Ma, Liping Zhu, Zhiwei Zhang, Chih-Ling Tsai, Raymond J Carroll

Annals of Statistics, volume 47(3), pages 1505-1535. Publication type: Journal Article.

Citations: 25

Abstract

A fundamental assumption used in causal inference with observational data is that treatment assignment is ignorable given measured confounding variables. This assumption of no missing confounders is plausible if a large number of baseline covariates are included in the analysis, as we often have no prior knowledge of which variables can be important confounders. Thus, estimation of treatment effects with a large number of covariates has received considerable attention in recent years. Most existing methods require specifying certain parametric models involving the outcome, treatment and confounding variables, and employ a variable selection procedure to identify confounders. However, selection of a proper set of confounders depends on correct specification of the working models. The bias due to model misspecification and incorrect selection of confounding variables can yield misleading results. We propose a robust and efficient approach for inference about the average treatment effect via a flexible modeling strategy incorporating penalized variable selection. Specifically, we consider an estimator constructed based on an efficient influence function that involves a propensity score and an outcome regression. We then propose a new sparse sufficient dimension reduction method to estimate these two functions without making restrictive parametric modeling assumptions. The proposed estimator of the average treatment effect is asymptotically normal and semiparametrically efficient without the need for variable selection consistency. The proposed methods are illustrated via simulation studies and a biomedical application.
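To make the estimator described in the abstract concrete: the efficient influence function for the average treatment effect tau, written with propensity score e(X) = P(T = 1 | X) and outcome regressions m_t(X) = E[Y | T = t, X], is m_1(X) - m_0(X) + T(Y - m_1(X))/e(X) - (1 - T)(Y - m_0(X))/(1 - e(X)) - tau. Averaging the first four terms with estimated nuisance functions plugged in gives the augmented inverse-probability-weighted (AIPW) estimator. The sketch below illustrates only that final step. It does not implement the paper's sparse sufficient dimension reduction: as a stand-in assumption, the two nuisance functions are fitted with plain logistic and linear regression, and the simulated data, function names, and propensity clipping bounds are all hypothetical.

```python
import numpy as np


def fit_logistic(X, t, iters=25, ridge=1e-6):
    """Logistic regression by Newton-Raphson (stand-in propensity model)."""
    Z = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(Z.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-Z @ beta))
        W = p * (1.0 - p)
        # Tiny ridge term keeps the Hessian invertible.
        H = Z.T @ (Z * W[:, None]) + ridge * np.eye(Z.shape[1])
        beta += np.linalg.solve(H, Z.T @ (t - p) - ridge * beta)
    return beta


def fit_ols(X, y):
    """Ordinary least squares with intercept (stand-in outcome model)."""
    Z = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return coef


def with_intercept(X):
    return np.column_stack([np.ones(len(X)), X])


def aipw_ate(y, t, e_hat, m1_hat, m0_hat):
    """AIPW point estimate and influence-function-based standard error."""
    phi = (m1_hat - m0_hat
           + t * (y - m1_hat) / e_hat
           - (1 - t) * (y - m0_hat) / (1 - e_hat))
    return phi.mean(), phi.std(ddof=1) / np.sqrt(len(phi))


def simulate_and_estimate(seed=0, n=5000, p=10):
    """Hypothetical simulation with a true average treatment effect of 2."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, p))
    e_true = 1.0 / (1.0 + np.exp(-(0.5 * X[:, 0] - 0.5 * X[:, 1])))
    t = rng.binomial(1, e_true)
    y = 2.0 * t + X[:, 0] + X[:, 1] + rng.normal(size=n)

    # Nuisance estimates (parametric stand-ins, not the paper's method).
    e_hat = 1.0 / (1.0 + np.exp(-with_intercept(X) @ fit_logistic(X, t)))
    e_hat = np.clip(e_hat, 0.01, 0.99)  # assumed bounds, for numerical stability
    m1_hat = with_intercept(X) @ fit_ols(X[t == 1], y[t == 1])
    m0_hat = with_intercept(X) @ fit_ols(X[t == 0], y[t == 0])
    return aipw_ate(y, t, e_hat, m1_hat, m0_hat)


if __name__ == "__main__":
    ate, se = simulate_and_estimate()
    print(f"AIPW estimate: {ate:.3f} (SE {se:.3f})")
```

In this sketch both working models happen to be correctly specified, so the estimate should land close to the simulated effect. The abstract's point is that fixed parametric fits like these can be misspecified in practice, which is the step the proposed sparse sufficient dimension reduction is meant to replace.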



Source journal

Annals of Statistics (Mathematics: Statistics & Probability)

CiteScore: 9.30
Self-citation rate: 8.90%
Articles published: 119
Review time: 6-12 weeks

About the journal: The Annals of Statistics aims to publish research papers of the highest quality reflecting the many facets of contemporary statistics. Primary emphasis is placed on importance and originality, not on formalism. The journal aims to cover all areas of statistics, especially mathematical statistics and applied and interdisciplinary statistics. Many of the best papers will touch on more than one of these general areas, because the discipline of statistics has deep roots in mathematics and in substantive scientific fields.