Causal Inference and Survey Data in Paediatric Epidemiology: Generalising Treatment Effects From Observational Data
Authors: Lizbeth Burgos-Ochoa, Felix J Clouth
Journal: Paediatric and Perinatal Epidemiology (impact factor 2.7; JCR Q2, Obstetrics & Gynecology)
Publication type: Journal Article
Published: 2025-07-14
DOI: 10.1111/ppe.70042 (https://doi.org/10.1111/ppe.70042)
Citations: 0
Abstract
Background: Survey data are essential in paediatric epidemiology, providing valuable insights into child health outcomes. The potential outcomes framework has advanced causal inference using observational data. However, traditional design-based adjustments, especially sample weights, are often overlooked. This omission limits the ability to generalise findings to the broader population.
Objective: This study demonstrates three approaches for estimating the population average treatment effect (PATE) in a practical example, examining the impact of household second-hand smoke (SHS) exposure on blood pressure in school-aged children.
Methods: Using data from the National Health and Nutrition Examination Survey (NHANES) 2017-2020, we assessed the effect of household SHS exposure, a non-randomised treatment, on blood pressure in school-aged children. We applied estimators based on Inverse Probability of Treatment Weighting (IPTW), G-computation, Targeted Maximum Likelihood Estimation (TMLE), and regression adjustment. Models without adjustments were run for comparison. We examined point estimates and the efficiency of the estimates obtained from these methods.
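As a rough illustration of how sample weights can enter two of the estimators named in the Methods, the sketch below computes a survey-weighted IPTW estimate and a survey-weighted G-computation estimate of the PATE. It is not the authors' code. The records are hypothetical toy values, not NHANES data, and the function names (`iptw_pate`, `gcomp_pate`, `naive_difference`) are invented for this example. It assumes a single binary confounder so that saturated (stratum-by-stratum) models can stand in for the regression models a real analysis would fit. TMLE and variance estimation are not shown.

```python
from collections import defaultdict

# Hypothetical toy records, NOT NHANES data:
# (binary confounder x, SHS exposure a, systolic BP y, survey weight w)
ROWS = [
    (0, 1, 103.0, 100.0), (0, 1, 101.0, 100.0),
    (0, 0, 100.0, 100.0), (0, 0, 100.0, 100.0),
    (0, 0, 99.0, 100.0), (0, 0, 101.0, 100.0),
    (1, 1, 110.0, 300.0), (1, 1, 108.0, 300.0),
    (1, 1, 109.0, 300.0), (1, 0, 105.0, 300.0),
]


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def _prepare(rows, use_survey_weights):
    # Setting every weight to 1 gives the sample (unweighted) analysis.
    if use_survey_weights:
        return list(rows)
    return [(x, a, y, 1.0) for x, a, y, _ in rows]


def propensity_by_stratum(rows):
    """Weighted P(a = 1 | x) from a saturated model: one proportion per stratum."""
    treated, total = defaultdict(float), defaultdict(float)
    for x, a, _, w in rows:
        total[x] += w
        treated[x] += w * a
    return {x: treated[x] / total[x] for x in total}


def iptw_pate(rows, use_survey_weights=True):
    """Hajek-type IPTW: the analysis weight is survey weight / P(observed arm | x)."""
    rows = _prepare(rows, use_survey_weights)
    e = propensity_by_stratum(rows)
    arm_means = {}
    for arm in (0, 1):
        ys, ws = [], []
        for x, a, y, w in rows:
            if a == arm:
                p_arm = e[x] if arm == 1 else 1.0 - e[x]
                ys.append(y)
                ws.append(w / p_arm)
        arm_means[arm] = weighted_mean(ys, ws)
    return arm_means[1] - arm_means[0]


def gcomp_pate(rows, use_survey_weights=True):
    """G-computation: predict both potential outcomes, then average with the weights."""
    rows = _prepare(rows, use_survey_weights)
    num, den = defaultdict(float), defaultdict(float)
    for x, a, y, w in rows:
        num[(a, x)] += w * y
        den[(a, x)] += w
    cell_mean = {k: num[k] / den[k] for k in num}  # outcome model m(a, x)
    contrasts = [cell_mean[(1, x)] - cell_mean[(0, x)] for x, _, _, _ in rows]
    return weighted_mean(contrasts, [w for _, _, _, w in rows])


def naive_difference(rows):
    """Unadjusted, unweighted difference in mean outcome between exposure groups."""
    treated = [y for _, a, y, _ in rows if a == 1]
    control = [y for _, a, y, _ in rows if a == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


if __name__ == "__main__":
    print("naive difference:     ", round(naive_difference(ROWS), 2))
    print("IPTW, no survey wts:  ", round(iptw_pate(ROWS, False), 2))
    print("IPTW, survey wts:     ", round(iptw_pate(ROWS), 2))
    print("G-comp, survey wts:   ", round(gcomp_pate(ROWS), 2))
```

In this toy setup the stratum with the larger effect is under-sampled and therefore carries larger weights, so the survey-weighted estimates (about 3.33) differ from the confounder-adjusted but unweighted estimate (2.8) and from the naive difference (5.2). Because both models are saturated here, the IPTW and G-computation estimates coincide exactly; with parametric models fitted to real data they generally would not.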
Results: The largest differences were observed between the unadjusted regression models and the fully adjusted methods (IPTW, G-computation, and TMLE), which account for both confounding and survey weights. Although including the sample weights led to wider confidence intervals for all methods, G-computation and TMLE showed comparatively narrower confidence intervals. Confidence intervals for the models not adjusted for sample weights were likely too narrow, understating the uncertainty of the estimates.
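One common way to see why unequal sample weights tend to widen confidence intervals is Kish's approximate effective sample size, n_eff = (Σw)² / Σw². The sketch below is an illustration only: the abstract does not say that the authors used this approximation, the weights are hypothetical, and the formula captures only the loss of precision from unequal weighting, not the effects of clustering or stratification in a design such as NHANES.

```python
import math


def kish_effective_n(weights):
    """Kish's approximate effective sample size: (sum w)^2 / sum(w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


def weighting_design_effect(weights):
    """Approximate variance inflation from unequal weights: n / n_eff."""
    return len(weights) / kish_effective_n(weights)


# Hypothetical weights: six respondents with weight 100, four with weight 300.
weights = [100.0] * 6 + [300.0] * 4

n_eff = kish_effective_n(weights)
deff = weighting_design_effect(weights)
ci_inflation = math.sqrt(deff)  # rough factor by which a CI half-width grows

print(f"n = {len(weights)}, n_eff = {n_eff:.2f}, "
      f"deff = {deff:.2f}, CI inflation = {ci_inflation:.2f}")
```

With equal weights the effective sample size equals the actual sample size and the inflation factor is 1. The more variable the weights, the smaller n_eff becomes, which is consistent with the wider intervals reported for the weighted analyses.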
Conclusions: This study highlights the important role of sample weights in causal inference. Generalising an average treatment effect estimated from data collected under common survey designs to a defined population requires the use of sample weights. The estimators described provide a framework for incorporating sample weights, and their use in health research is recommended.
About the journal:
Paediatric and Perinatal Epidemiology crosses the boundaries between the epidemiologist and the paediatrician, obstetrician or specialist in child health, ensuring that important paediatric and perinatal studies reach those clinicians for whom the results are especially relevant. In addition to original research articles, the Journal also includes commentaries, book reviews and annotations.