Observational studies. Pub Date: 2025-04-11. eCollection Date: 2025-01-01. DOI: 10.1353/obs.2025.a956842
Rachael Phillips, Mark van der Laan
{"title":"Commentary on ``Nonparametric identification is not enough, but randomized controlled trials are'': Statistical considerations for generating reliable evidence across a spectrum of studies that increasingly involve real-world elements.","authors":"Rachael Phillips, Mark van der Laan","doi":"10.1353/obs.2025.a956842","DOIUrl":"10.1353/obs.2025.a956842","url":null,"abstract":"<p><p>Judea Pearl, quoted in Pearl and Mackenzie (2018), stated that \"once we have understood why [randomized controlled trials] RCTs work, there is no need to put them on a pedestal and treat them as the gold standard of causal analysis, which all other methods should emulate.\" In Aronow et al. (2024), this claim is refuted, drawing on results of Robins and Ritov (1997). The argument is made that statistical estimation and inference tend to be fundamentally more difficult in observational studies than in randomized controlled trials, even when all confounders are observed and measured without error. We congratulate the authors for raising this highly timely, interesting discussion and welcome this opportunity to join this important debate. In this commentary, we focus on what it takes to generate reliable evidence across a spectrum of studies that increasingly involve real-world elements and less control over design. A related question is whether, along this spectrum of studies, the reliability of evidence generated by a statistical analysis decreases. We claim that this is not the case, but that the challenge for the appropriate statistical method increases, requiring sophisticated and careful execution.</p>","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"11 1","pages":"61-76"},"PeriodicalIF":0.0,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12139718/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144251200","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Observational studies. Pub Date: 2025-04-11. eCollection Date: 2025-01-01. DOI: 10.1353/obs.2025.a956837
P M Aronow, James M Robins, Theo Saarinen, Fredrik Sävje, Jasjeet S Sekhon
{"title":"Nonparametric identification is not enough, but randomized controlled trials are.","authors":"P M Aronow, James M Robins, Theo Saarinen, Fredrik Sävje, Jasjeet S Sekhon","doi":"10.1353/obs.2025.a956837","DOIUrl":"10.1353/obs.2025.a956837","url":null,"abstract":"<p><p>We argue that randomized controlled trials (RCTs) are special even among studies for which a nonparametric unconfoundedness assumption is credible. This claim follows from two results of Robins and Ritov (1997). First, in settings with at least one continuous confounder, there exists no estimator of the average treatment effect that is uniformly consistent unless the propensity score is known or additional assumptions are made on the complexity of the propensity score function. Second, with binary outcomes, knowledge of the propensity score yields a uniformly consistent estimator and finite-sample valid confidence intervals that shrink at a parametric rate, regardless of how complicated the propensity score function might be. We emphasize the latter point, and note that a successfully executed RCT provides knowledge of the propensity score to the researcher. We conclude that statistical estimation and inference tend to be fundamentally more difficult in observational settings than in RCTs, even when all confounders are observed and measured without error.</p>","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"11 1","pages":"3-16"},"PeriodicalIF":0.0,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12139723/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144251202","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Observational studies. Pub Date: 2025-04-11. eCollection Date: 2025-01-01. DOI: 10.1353/obs.2025.a956840
Christopher Harshaw
{"title":"Why are RCTs the Gold Standard? The Epistemological Difference Between Randomized Experiments and Observational Studies.","authors":"Christopher Harshaw","doi":"10.1353/obs.2025.a956840","DOIUrl":"10.1353/obs.2025.a956840","url":null,"abstract":"<p><p>In response to Pearl, Aronow et al. (2025) argue that randomized experiments are special among causal inference methods due to their statistical properties. I believe that the key distinction between randomized experiments and observational studies is not statistical, but rather epistemological in nature. In this comment, I aim to articulate this epistemological distinction and argue that it ought to take a more central role in these discussions.</p>","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"11 1","pages":"41-46"},"PeriodicalIF":0.0,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12139715/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144251206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Observational studies. Pub Date: 2025-04-11. eCollection Date: 2025-01-01. DOI: 10.1353/obs.2025.a956838
Drew Dimmery, Kevin Munger
{"title":"Enough?","authors":"Drew Dimmery, Kevin Munger","doi":"10.1353/obs.2025.a956838","DOIUrl":"10.1353/obs.2025.a956838","url":null,"abstract":"<p><p>We provide a critical response to Aronow et al. (2021) which argued that randomized controlled trials (RCTs) are \"enough,\" while nonparametric identification in observational studies is not. We first investigate what is meant by \"enough,\" arguing that this is fundamentally a sociological claim about the relationship between statistical work and relevant institutional processes (here, academic peer review), rather than something that can be decided from within the logic of statistics. For a more complete conception of \"enough,\" we outline all that would need to be known - not just knowledge of propensity scores, but knowledge of many other spatial and temporal characteristics of the social world. Even granting the logic of the critique in Aronow et al. (2021), its practical importance is a question of the contexts under study. We argue that we should not be satisfied by appeals to intuition or experience about the complexity of \"naturally occurring\" propensity score functions. Instead, we call for more empirical metascience to begin to characterize this complexity. We apply this logic to the case of recommender systems as a demonstration of the weakness of allowing statisticians' intuitions to serve in place of metascientific data. This may be, as Aronow et al. (2021) claim, one of the \"few free lunches in statistics\"-but like many of the free lunches consumed by statisticians, it is only available to those working at a handful of large tech firms. Rather than implicitly deciding what is \"enough\" based on statistical applications the social world has determined to be most profitable, we argue that practicing statisticians should explicitly engage with questions like \"for what?\" and \"for whom?\" in order to adequately answer the question of \"enough?\"</p>","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"11 1","pages":"17-26"},"PeriodicalIF":0.0,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12139716/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144251201","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Observational studies. Pub Date: 2025-04-11. eCollection Date: 2025-01-01. DOI: 10.1353/obs.2025.a956844
P M Aronow, James M Robins, Theo Saarinen, Fredrik Sävje, Jasjeet S Sekhon
{"title":"Rejoinder: Nonparametric identification is not enough, but randomized controlled trials are.","authors":"P M Aronow, James M Robins, Theo Saarinen, Fredrik Sävje, Jasjeet S Sekhon","doi":"10.1353/obs.2025.a956844","DOIUrl":"10.1353/obs.2025.a956844","url":null,"abstract":"<p><p>We thank the editor for organizing a diverse and wide-ranging discussion, and we thank the commentators for their detailed and thoughtful remarks. Most of the commentators provide broader perspectives on randomized experiments and their role in modern empirical practice. We believe this broader perspective is important, and the comments serve as complements to the somewhat narrow points we made in our paper. However, we believe these narrow points are of great consequence, and we find it useful to briefly recapitulate them here. When a practitioner aims to estimate averages of bounded potential outcomes (e.g., the average treatment effect on a binary outcome) in a setting where both ignorability and positivity are known to hold after adjusting for at least one continuous covariate, the following statements are true: • If the propensity score is known, such as in a randomized controlled trial (RCT), there exist simple estimators that are uniformly root-n consistent and asymptotically normal. Confidence intervals based on these estimators are finite-sample valid and their widths shrink at a root-n rate. • If the propensity score is not known, such as in an observational study, there exist neither uniformly consistent estimators nor uniform (i.e., honest) large-sample confidence intervals whose widths are shrinking with the sample size. To achieve these properties, the practitioner must impose untestable assumptions on either the propensity score function or the conditional expectation function of the outcomes.</p>","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"11 1","pages":"85-90"},"PeriodicalIF":0.0,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12139717/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144251204","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Observational studies. Pub Date: 2025-04-11. eCollection Date: 2025-01-01. DOI: 10.1353/obs.2025.a956843
Benjamin Recht
{"title":"A Bureaucratic Theory of Statistics.","authors":"Benjamin Recht","doi":"10.1353/obs.2025.a956843","DOIUrl":"10.1353/obs.2025.a956843","url":null,"abstract":"<p><p>This commentary proposes a framework for understanding the role of statistics in policymaking, regulation, and bureaucratic systems. I introduce the concept of \"ex ante policy,\" describing statistical rules and procedures designed before data collection to govern future actions. Through examining examples, particularly clinical trials, I explore how ex ante policy serves as a calculus of bureaucracy, providing numerical foundations for governance through clear, transparent rules. The ex ante frame obviates heated debates about inferential interpretations of probability and statistical tests, p-values, and rituals. I conclude by calling for a deeper appreciation of statistics' bureaucratic function and suggesting new directions for research in policy-oriented statistical methodology.</p>","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"11 1","pages":"77-84"},"PeriodicalIF":0.0,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12139714/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144251199","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Observational studies. Pub Date: 2025-04-11. eCollection Date: 2025-01-01. DOI: 10.1353/obs.2025.a956841
Arman Oganisian, Antonio Linero
{"title":"Priors and Propensity Scores in Bayesian Causal Inference.","authors":"Arman Oganisian, Antonio Linero","doi":"10.1353/obs.2025.a956841","DOIUrl":"10.1353/obs.2025.a956841","url":null,"abstract":"<p><p>Aronow et al. (2025) provide a convincing case for the special status of randomized controlled trials (RCTs) in which the propensity scores are known and can be used to make causal inferences. Here we provide a Bayesian perspective on their work by summarizing recent developments in the Bayesian literature on the topic. Whether the propensity score should play a role in Bayesian causal inference - and what that role(s) should be - has been a controversial topic for some time. We begin by describing Bayesian inference for population-level estimands and show that under commonly made (but not necessarily required) assumptions, the propensity score model has no role to play in Bayesian causal inference from a purist perspective. We discuss recent work on why these assumptions can be problematic - particularly in high-dimensional models - and discuss several Bayesian motivations for relaxing them. We describe how recent approaches for incorporating the propensity score correspond to different ways of relaxing these assumptions. Given these considerations, we illustrate how a Bayesian might approach the synthetic examples of Aronow et al. (2025).</p>","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"11 1","pages":"47-60"},"PeriodicalIF":0.0,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12139722/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144251203","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Observational studies. Pub Date: 2025-04-11. eCollection Date: 2025-01-01. DOI: 10.1353/obs.2025.a956839
Peng Ding
{"title":"What randomization can and cannot guarantee.","authors":"Peng Ding","doi":"10.1353/obs.2025.a956839","DOIUrl":"10.1353/obs.2025.a956839","url":null,"abstract":"<p><p>Aronow et al. (2024) provide a great service to the causal inference community by delineating the key results in Robins and Ritov (1997). They show that randomized controlled trials (RCTs) ensure much stronger statistical inference than unconfounded observational studies even though nonparametric identification is identical in both settings. These results are in sharp contrast to the claim in Pearl and Mackenzie (2018) that RCTs are not the gold standard of causal analysis. Pearl and Mackenzie's (2018) claim is false and misleading for empirical researchers who want to infer causal effects based on data with finite sample sizes. I will further review what randomization can and cannot guarantee more broadly. In particular, I will highlight the value of randomization-based inference in RCTs, the limit of randomization alone for more complicated causal inference questions, and the importance of sensitivity analysis in observational studies.</p>","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"11 1","pages":"27-40"},"PeriodicalIF":0.0,"publicationDate":"2025-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12139720/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"144251205","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Does matching introduce confounding or selection bias into the matched case-control design?","authors":"Fei Wan, S. Sutcliffe, Jeffrey Zhang, Dylan Small","doi":"10.1353/obs.2024.a929114","DOIUrl":"https://doi.org/10.1353/obs.2024.a929114","url":null,"abstract":"Abstract:The impact of matching on confounding control in case-control studies remains a subject of ongoing debate, with varying perspectives among researchers. While matching is a well-established method for controlling confounding in cohort studies, its effectiveness in mitigating confounding in case-control studies has long been questioned. Recent studies have determined that matching doesn't eliminate confounding but, instead, introduces a selection bias on top of the initial confounding, as indicated by causal diagram analysis. This conclusion suggests that the control of initial confounding through matching is either only partial or non-existent. However, this conclusion may not be accurate in exactly matched designs because causal diagrams cannot always reveal precisely the interplay between the initial confounding and the matching-induced selection effect. In this paper, we employ analytical results in conjunction with causal diagrams to demonstrate that the cancellation of the initial confounding by the selection effect is complete in exact individually matched case-control studies. Nevertheless, this cancellation results in a residual selection effect that establishes a backdoor connection between the matching factors and the outcome in the matched design. Failure to adjust for this residual selection effect leads to biased estimates of the exposure effect. Furthermore, this backdoor connection causes matching factors to act like confounding factors in the matched case-control design, which complicates the interpretation of the bias introduced by matching in the current literature.","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"321 4","pages":"1 - 9"},"PeriodicalIF":0.0,"publicationDate":"2024-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141381359","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Using a difference-in-difference control trial to test an intervention aimed at increasing the take-up of a welfare payment in New Zealand","authors":"David Rea, Dean R. Hyslop","doi":"10.1353/obs.2023.a906626","DOIUrl":"https://doi.org/10.1353/obs.2023.a906626","url":null,"abstract":"Abstract:This paper describes a difference-in-difference control trial (DDCT) of an intervention designed to increase the take-up of an income support payment in the New Zealand welfare system. The intervention used a microsimulation model to identify potential claimants who were then contacted by either phone, email, or letter. The trial was designed as a DDCT because of ethical concerns associated with a fully randomized approach. The trial provided convincing evidence that the intervention would increase the take-up of the payment and a modified version was then implemented as an ongoing business process by the New Zealand Ministry of Social Development (MSD). The findings from the trial contribute to the literature about how best to increase the take-up of welfare payments. The study also demonstrates the value of using a difference-in-difference control trial.","PeriodicalId":74335,"journal":{"name":"Observational studies","volume":"9 1","pages":"49 - 72"},"PeriodicalIF":0.0,"publicationDate":"2023-09-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46729290","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}