Jiaxin Zhang, S. Ghazaleh Dashti, John B. Carlin, Katherine J. Lee, Margarita Moreno-Betancur
Biometrical Journal, volume 66, issue 3. Published 2024-04-18. DOI: 10.1002/bimj.202200326
https://onlinelibrary.wiley.com/doi/10.1002/bimj.202200326
Recoverability and estimation of causal effects under typical multivariable missingness mechanisms
In the context of missing data, the identifiability or “recoverability” of the average causal effect (ACE) depends not only on the usual causal assumptions but also on missingness assumptions that can be depicted by adding variable-specific missingness indicators to causal diagrams, creating missingness directed acyclic graphs (m-DAGs). Previous research described canonical m-DAGs, representing typical multivariable missingness mechanisms in epidemiological studies, and examined mathematically the recoverability of the ACE in each case. However, this work assumed no effect modification and did not investigate methods for estimation across such scenarios. Here, we extend this research by determining the recoverability of the ACE in settings with effect modification and conducting a simulation study to evaluate the performance of widely used missing data methods when estimating the ACE using correctly specified g-computation. Methods assessed were complete case analysis (CCA) and various implementations of multiple imputation (MI) with varying degrees of compatibility with the outcome model used in g-computation. Simulations were based on an example from the Victorian Adolescent Health Cohort Study (VAHCS), where interest was in estimating the ACE of adolescent cannabis use on mental health in young adulthood. We found that the ACE is recoverable when no incomplete variable (exposure, outcome, or confounder) causes its own missingness, and nonrecoverable otherwise, in simplified versions of 10 canonical m-DAGs that excluded unmeasured common causes of missingness indicators. Despite this nonrecoverability, simulations showed that MI approaches that are compatible with the outcome model in g-computation may enable approximately unbiased estimation across all canonical m-DAGs considered, except when the outcome causes its own missingness or causes the missingness of a variable that causes its own missingness. 
In the latter settings, researchers may need to consider sensitivity analysis methods incorporating external information (e.g., delta-adjustment methods). The VAHCS case study illustrates the practical implications of these findings.
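The contrast between missingness that depends only on other variables and an outcome that causes its own missingness can be illustrated with a toy simulation. The sketch below is not the paper's simulation design: the data-generating probabilities, the function names (`simulate`, `g_computation`, `complete_cases`), and the two missingness mechanisms are all assumptions chosen for illustration. A single binary confounder `C`, binary exposure `A`, and binary outcome `Y` are generated with a true ACE (risk difference) of 0.2 and no effect modification on that scale, and a saturated stratified estimator stands in for a correctly specified outcome model in g-computation. Only complete case analysis is shown; multiple imputation is not implemented here.

```python
import random


def simulate(n, seed=0):
    """Simulate (C, A, Y) records; the true risk difference for A is 0.2 (assumed values)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        c = int(rng.random() < 0.5)                      # confounder
        a = int(rng.random() < 0.2 + 0.4 * c)            # exposure depends on C
        y = int(rng.random() < 0.1 + 0.2 * a + 0.3 * c)  # outcome depends on A and C
        rows.append((c, a, y))
    return rows


def g_computation(rows):
    """Nonparametric g-computation: average the stratum risk differences over P(C)."""
    count = {}   # (c, a) -> number of records
    events = {}  # (c, a) -> number of records with Y = 1
    for c, a, y in rows:
        count[(c, a)] = count.get((c, a), 0) + 1
        events[(c, a)] = events.get((c, a), 0) + y
    n = len(rows)
    ace = 0.0
    for c in (0, 1):
        risk_exposed = events[(c, 1)] / count[(c, 1)]
        risk_unexposed = events[(c, 0)] / count[(c, 0)]
        p_c = (count[(c, 0)] + count[(c, 1)]) / n
        ace += p_c * (risk_exposed - risk_unexposed)
    return ace


def complete_cases(rows, p_missing, seed=1):
    """Keep records whose outcome is observed; p_missing(c, a, y) is the missingness probability."""
    rng = random.Random(seed)
    return [r for r in rows if rng.random() >= p_missing(*r)]


rows = simulate(200_000)
ace_full = g_computation(rows)

# Mechanism 1 (assumed): outcome missingness depends on C and A only.
ace_cca_ok = g_computation(
    complete_cases(rows, lambda c, a, y: 0.1 + 0.3 * c + 0.2 * a)
)

# Mechanism 2 (assumed): the outcome causes its own missingness.
ace_cca_self = g_computation(
    complete_cases(rows, lambda c, a, y: 0.1 + 0.5 * y)
)

print(f"full data:                 {ace_full:.3f}")
print(f"CCA, missingness from C,A: {ace_cca_ok:.3f}")
print(f"CCA, Y causes missingness: {ace_cca_self:.3f}")
```

Under these assumed values, the complete-case estimate should stay close to 0.2 when missingness depends only on `C` and `A`, and should fall noticeably below 0.2 when `Y` drives its own missingness. The first result relies on the absence of effect modification in this toy setup: the complete cases have a distorted distribution of `C`, which would matter if the stratum-specific effects differed.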
About the journal:
Biometrical Journal publishes papers on statistical methods and their applications in life sciences including medicine, environmental sciences and agriculture. Methodological developments should be motivated by an interesting and relevant problem from these areas. Ideally the manuscript should include a description of the problem and a section detailing the application of the new methodology to the problem. Case studies, review articles and letters to the editors are also welcome. Papers containing only extensive mathematical theory are not suitable for publication in Biometrical Journal.