Comparing causal inference methods for point exposures with missing confounders: a simulation study.
Authors: Luke Benz, Alexander W Levis, Sebastien Haneuse
Journal: BMC Medical Research Methodology, 25(1):222
Published: 2025-09-29
DOI: https://doi.org/10.1186/s12874-025-02675-2
PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12482880/pdf/
Abstract
Causal inference methods based on electronic health record (EHR) databases must simultaneously handle confounding and missing data. In practice, when faced with partially missing confounders, analysts may proceed by first imputing missing data and subsequently using outcome regression or inverse-probability weighting (IPW) to address confounding. However, little is known about the theoretical performance of such reasonable but ad hoc methods. Though a vast literature exists on each of these two challenges separately, relatively few works attempt to address missing data and confounding simultaneously in a formal manner. In a recent paper, Levis et al. (Can J Stat e11832, 2024) outlined a robust framework for tackling these problems together under certain identifying conditions, and introduced a pair of estimators for the average treatment effect (ATE), one of which is non-parametric efficient. In this work we present a series of simulations, motivated by a published EHR-based study (Arterburn et al., Ann Surg 274:e1269-e1276, 2020) of the long-term effects of bariatric surgery on weight outcomes, to investigate these new estimators and compare them to existing ad hoc methods. While methods based on ad hoc combinations of imputation and confounding adjustment perform well in certain scenarios, no single estimator is uniformly best. We conclude with recommendations for good practice in the face of partially missing confounders.
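To make the "impute first, then adjust" workflow mentioned in the abstract concrete, the following is a minimal sketch of one such ad hoc combination: a single stochastic imputation of a binary confounder followed by a normalized (Hajek) IPW estimate of the ATE. It is not the authors' simulation design and not the estimators of Levis et al. The data-generating process, the coefficient values, the missing-at-random mechanism, and the function names (`simulate`, `fit_logistic`, `impute_then_ipw`) are all assumptions introduced here for illustration.

```python
import numpy as np


def expit(z):
    """Inverse logit."""
    return 1.0 / (1.0 + np.exp(-z))


def fit_logistic(X, y, n_iter=25):
    """Logistic regression by Newton-Raphson; X must include an intercept column."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = expit(X @ beta)
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta


def simulate(n, rng):
    """Hypothetical data: binary confounder L, binary exposure A, outcome Y.

    The true ATE is 1.0 by construction. L is missing with a probability that
    depends only on (A, Y), i.e. missing at random given exposure and outcome
    (an assumption of this sketch).
    """
    L = rng.binomial(1, 0.5, size=n)
    A = rng.binomial(1, expit(-0.5 + 1.0 * L))
    Y = 1.0 * A + 2.0 * L + rng.normal(size=n)
    R = rng.binomial(1, expit(1.0 + 0.5 * A - 0.3 * Y))  # R = 1 means L is observed
    L_obs = np.where(R == 1, L.astype(float), np.nan)
    return A, Y, L_obs


def impute_then_ipw(A, Y, L_obs, rng):
    """Ad hoc two-step estimator: impute L from (A, Y), then apply Hajek IPW."""
    obs = ~np.isnan(L_obs)

    # Step 1: fit P(L = 1 | A, Y) on complete cases and draw missing values once.
    D = np.column_stack([np.ones(len(Y)), A, Y])
    p_L = expit(D @ fit_logistic(D[obs], L_obs[obs]))
    L_imp = L_obs.copy()
    L_imp[~obs] = rng.binomial(1, p_L[~obs])

    # Step 2: fit the propensity score P(A = 1 | L) on the completed data.
    X = np.column_stack([np.ones(len(Y)), L_imp])
    ps = expit(X @ fit_logistic(X, A))

    # Step 3: normalized inverse-probability-weighted difference in means.
    w1, w0 = A / ps, (1 - A) / (1 - ps)
    return np.sum(w1 * Y) / np.sum(w1) - np.sum(w0 * Y) / np.sum(w0)


if __name__ == "__main__":
    rng = np.random.default_rng(2025)
    A, Y, L_obs = simulate(20000, rng)
    print("unadjusted difference in means:", Y[A == 1].mean() - Y[A == 0].mean())
    print("impute-then-IPW estimate:", impute_then_ipw(A, Y, L_obs, rng))
```

In this toy setting both the imputation model and the propensity score model happen to be correctly specified, so the two-step estimate should land close to the true ATE while the unadjusted contrast does not. A single imputation also ignores imputation uncertainty, so this sketch gives a point estimate only. The abstract's point is that such favorable behavior is scenario-dependent rather than guaranteed.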
About the journal:
BMC Medical Research Methodology is an open access journal publishing original peer-reviewed research articles in methodological approaches to healthcare research. Articles on the methodology of epidemiological research, clinical trials and meta-analysis/systematic review are particularly encouraged, as are empirical studies of the associations between choice of methodology and study outcomes. BMC Medical Research Methodology does not aim to publish articles describing scientific methods or techniques: these should be directed to the BMC journal covering the relevant biomedical subject area.