{"title":"基于设计和基于分析的方法对观测数据进行因果推理的比较研究","authors":"Junni L. Zhang","doi":"10.1080/24709360.2021.1992246","DOIUrl":null,"url":null,"abstract":"Causal inference with observational data is a central goal in many fields. Propensity score methods are design-based approaches that try to ensure covariate balance without using information from the outcome variables. Analysis-based approaches, such as the Bayesian Additive Regression Tree and the Causal Forest, bypass the issue of covariate balance, and directly model the outcomes. We use a Monte Carlo simulation to study the performance of these two types of approaches. Some of the simulation scenarios involve large number of covariates relative to the number of observations. We find that the analysis-based approaches can yield very poor performance, without any warning about not enough overlap between the covariate distributions for the treated and control groups. In contrast, the propensity score methods provide warning about not enough overlap, but such warning could be overly-cautious when there is enough overlap.","PeriodicalId":37240,"journal":{"name":"Biostatistics and Epidemiology","volume":"6 1","pages":"239 - 248"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A comparative study of design-based and analysis-based approaches to causal inference with observational data\",\"authors\":\"Junni L. Zhang\",\"doi\":\"10.1080/24709360.2021.1992246\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Causal inference with observational data is a central goal in many fields. Propensity score methods are design-based approaches that try to ensure covariate balance without using information from the outcome variables. Analysis-based approaches, such as the Bayesian Additive Regression Tree and the Causal Forest, bypass the issue of covariate balance, and directly model the outcomes. 
We use a Monte Carlo simulation to study the performance of these two types of approaches. Some of the simulation scenarios involve large number of covariates relative to the number of observations. We find that the analysis-based approaches can yield very poor performance, without any warning about not enough overlap between the covariate distributions for the treated and control groups. In contrast, the propensity score methods provide warning about not enough overlap, but such warning could be overly-cautious when there is enough overlap.\",\"PeriodicalId\":37240,\"journal\":{\"name\":\"Biostatistics and Epidemiology\",\"volume\":\"6 1\",\"pages\":\"239 - 248\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-10-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Biostatistics and Epidemiology\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1080/24709360.2021.1992246\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"Medicine\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Biostatistics and Epidemiology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1080/24709360.2021.1992246","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Medicine","Score":null,"Total":0}
A comparative study of design-based and analysis-based approaches to causal inference with observational data
Causal inference with observational data is a central goal in many fields. Propensity score methods are design-based approaches that try to ensure covariate balance without using information from the outcome variables. Analysis-based approaches, such as the Bayesian Additive Regression Tree and the Causal Forest, bypass the issue of covariate balance and model the outcomes directly. We use a Monte Carlo simulation to study the performance of these two types of approaches. Some of the simulation scenarios involve a large number of covariates relative to the number of observations. We find that the analysis-based approaches can perform very poorly while giving no warning that the covariate distributions of the treated and control groups overlap too little. In contrast, the propensity score methods do warn about insufficient overlap, but the warning can be overly cautious when overlap is in fact sufficient.
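To make the idea of an overlap warning concrete, the following is a minimal sketch of a propensity score overlap check on simulated data. It is not the simulation design used in the paper: the data-generating model, the ridge-penalized logistic fit, the common-support rule, and the names `simulate`, `fit_propensity`, and `overlap_fraction` are all assumptions made for illustration.

```python
import numpy as np


def simulate(n, p, strength, rng):
    """Draw covariates X and a binary treatment T from a logistic model.

    Hypothetical setup: `strength` controls how strongly the covariates
    drive treatment assignment, and therefore how little overlap there is.
    """
    X = rng.normal(size=(n, p))
    beta = np.full(p, strength / np.sqrt(p))
    prob = 1.0 / (1.0 + np.exp(-(X @ beta)))
    T = rng.binomial(1, prob)
    return X, T


def fit_propensity(X, T, ridge=1.0, iters=50):
    """Estimate propensity scores with ridge-penalized logistic regression.

    Fitted by Newton's method; the outcome is never used, which is what
    makes this a design-stage step.
    """
    n, p = X.shape
    Z = np.column_stack([np.ones(n), X])
    w = np.zeros(p + 1)
    penalty = ridge * np.eye(p + 1)
    penalty[0, 0] = 0.0  # leave the intercept unpenalized
    for _ in range(iters):
        lin = np.clip(Z @ w, -30.0, 30.0)  # guard against overflow in exp
        e = 1.0 / (1.0 + np.exp(-lin))
        grad = Z.T @ (T - e) - penalty @ w
        hess = (Z * (e * (1.0 - e))[:, None]).T @ Z + penalty
        w = w + np.linalg.solve(hess, grad)
    lin = np.clip(Z @ w, -30.0, 30.0)
    return 1.0 / (1.0 + np.exp(-lin))


def overlap_fraction(e, T):
    """Share of units whose estimated score lies in the common support.

    Common support here is the interval covered by both the treated and
    the control score ranges (one simple rule among several in use).
    """
    lo = max(e[T == 1].min(), e[T == 0].min())
    hi = min(e[T == 1].max(), e[T == 0].max())
    return float(np.mean((e >= lo) & (e <= hi)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for strength in (0.5, 4.0):
        X, T = simulate(n=1000, p=5, strength=strength, rng=rng)
        e = fit_propensity(X, T)
        print(f"strength={strength}: share in common support = "
              f"{overlap_fraction(e, T):.3f}")
```

A low share in the common support is the kind of signal a design-based workflow surfaces before any outcome is examined. An outcome model fitted to the same data would still return an estimate and would not flag the problem on its own.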