Overcoming Data Gaps in Life Course Epidemiology by Matching Across Cohorts
Katrina L Kezios, Scott C Zimmerman, Peter T Buto, Kara E Rudolph, Sebastian Calonico, Adina Zeki Al Hazzouri, M Maria Glymour
Epidemiology. Journal Article. Published 2024-09-01 (Epub 2024-07-05).
DOI: 10.1097/EDE.0000000000001761 (https://doi.org/10.1097/EDE.0000000000001761)
PDF (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11305898/pdf/
Abstract
Life course epidemiology is hampered by the absence of large studies with exposures and outcomes measured at different life stages in the same individuals. We describe when the effect of an exposure (A) on an outcome (Y) in a target population is identifiable in a combined ("synthetic") cohort created by pooling an early-life cohort including measures of A with a late-life cohort including measures of Y. We enumerate causal assumptions needed for unbiased effect estimation in the synthetic cohort and illustrate by simulating target populations under four causal models. From each target population, we randomly sampled early- and late-life cohorts and created a synthetic cohort by matching individuals from the two cohorts based on mediators and confounders. We estimated the effect of A on Y in the synthetic cohort, varying matching variables, the match ratio, and the strength of association between matching variables and A. Finally, we compared bias in the synthetic cohort estimates when matching variables did not d-separate A and Y to the bias expected in the original cohort. When the set of matching variables includes all variables d-connecting exposure and outcome (i.e., variables blocking all backdoor and front-door pathways), the synthetic cohort yields unbiased effect estimates. Even when matching variables did not fully account for confounders, the synthetic cohort estimate was sometimes less biased than comparable estimates in the original cohort. Methods based on merging cohorts may hasten the evaluation of early- and mid-life determinants of late-life health but rely on available measures of both confounders and mediators.
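The matching idea in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code: it assumes one hypothetical causal model (a binary confounder C affecting A and Y, and a binary mediator M carrying the entire effect of A on Y), made-up coefficients, and exact 1:1 matching with replacement on (C, M). All function and variable names are illustrative assumptions.

```python
import random
from collections import defaultdict
from statistics import mean


def simulate_person(rng):
    """One draw from a hypothetical target population: C -> A, C -> Y, A -> M -> Y."""
    c = int(rng.random() < 0.5)               # confounder
    a = int(rng.random() < 0.3 + 0.4 * c)     # exposure depends on C
    m = int(rng.random() < 0.2 + 0.5 * a)     # mediator depends on A
    y = 1.0 * m + 0.5 * c + rng.gauss(0, 1)   # outcome depends on M and C only
    return {"C": c, "A": a, "M": m, "Y": y}


def build_synthetic_cohort(early, late, rng):
    """Exact 1:1 matching with replacement on (C, M): attach a late-life Y to each early-life record."""
    pools = defaultdict(list)
    for person in late:
        pools[(person["C"], person["M"])].append(person["Y"])
    synthetic = []
    for person in early:
        pool = pools.get((person["C"], person["M"]))
        if pool:  # drop early-life records with no match
            synthetic.append({**person, "Y": rng.choice(pool)})
    return synthetic


def adjusted_effect(cohort):
    """Mean difference in Y for A=1 vs A=0, standardized over the distribution of C."""
    n = len(cohort)
    total = 0.0
    for c in (0, 1):
        stratum = [p for p in cohort if p["C"] == c]
        y1 = [p["Y"] for p in stratum if p["A"] == 1]
        y0 = [p["Y"] for p in stratum if p["A"] == 0]
        total += (mean(y1) - mean(y0)) * len(stratum) / n
    return total


def run(n=20000, seed=0):
    rng = random.Random(seed)
    # Early-life cohort observes C, A, M but not Y.
    early = [{k: v for k, v in simulate_person(rng).items() if k != "Y"} for _ in range(n)]
    # Late-life cohort is an independent sample observing C, M, Y but not A.
    late = [{k: v for k, v in simulate_person(rng).items() if k != "A"} for _ in range(n)]
    synthetic = build_synthetic_cohort(early, late, rng)
    return adjusted_effect(synthetic)


if __name__ == "__main__":
    print(f"synthetic-cohort estimate: {run():.3f}")
```

Under these assumed coefficients the true effect of A on Y is 0.5 × 1.0 = 0.5, and because (C, M) d-separates A and Y in this model, the synthetic-cohort estimate should land near that value. Removing M from the matching key in `build_synthetic_cohort` is one way to see the bias that arises when the matching set no longer blocks the front-door pathway.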
About the journal:
Epidemiology publishes original research from all fields of epidemiology. The journal also welcomes review articles and meta-analyses, novel hypotheses, descriptions and applications of new methods, and discussions of research theory or public health policy. We give special consideration to papers from developing countries.