The use and misuse of surrogate variables in environmental epidemiology

Journal of Environmental Medicine. Pub Date: 1999-10-01. DOI: 10.1002/JEM.40

F. Lipfert

Pages: 267-278. Publication type: Journal Article.

Citations: 3

Abstract

This paper discusses some common statistical problems that are often encountered in the specification and interpretation of regression models used in environmental epidemiology; such models have been used to establish new or modified ambient standards intended to protect public health. These statistical problems include: collinearity (identifying the ‘correct’ pollutant), confounding (omission of other variables that may be correlated with both response and putative dose), the ‘ecological fallacy’ (aggregating individual doses and responses over space or time), measurement error (uncertainties in data, applicability and measurement per se) and linearity (identifying curvature or thresholds in dose-response function). These problems occur in both time-series and cross-sectional studies. Although none of these potential problem areas is new, they have rarely been considered together or comprehensively. This paper considers them as specific instances of the general problem of surrogate variables, for which an analytical framework is presented together with some examples of their practical consequences and some guidelines for interpreting environmental epidemiology studies. Findings of the analysis include: single-pollutant regression models are likely to overstate effects; although aggregation results in loss of information, it biases the estimates only when confounding is present; the traditional approaches to correcting for measurement errors implied by the difference between personal exposures and ambient air quality do not apply, but estimates may be based on consideration of the ‘error’ term as an additional source of exposure; it may not be possible to deduce the correct shape of a dose-response function in the presence of measurement error and correlated covariates. 
These findings are intended to be descriptive rather than definitive; the main purpose is to stimulate the detailed research required to develop practical remedies that would allow epidemiology to be used appropriately in setting environmental standards. Copyright © 1999 John Wiley & Sons, Ltd.
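The finding that single-pollutant regression models are likely to overstate effects can be illustrated with a small simulation. The sketch below is not the paper's analytical framework; it is a minimal textbook example of omitted-variable bias under assumed conditions: two standardized pollutants with correlation `rho`, a linear response with true coefficients `beta1` and `beta2`, and independent Gaussian noise. The function name `simulate_slopes` and all parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_slopes(n=20000, rho=0.7, beta1=1.0, beta2=1.0, seed=0):
    """Compare the estimated slope for pollutant 1 in a single-pollutant
    model with the slope from a joint two-pollutant model.

    All inputs are illustrative assumptions, not values from the paper.
    """
    rng = np.random.default_rng(seed)

    # Two collinear pollutants with unit variance and correlation rho.
    cov = [[1.0, rho], [rho, 1.0]]
    x = rng.multivariate_normal([0.0, 0.0], cov, size=n)

    # Linear response that depends on BOTH pollutants, plus noise.
    y = beta1 * x[:, 0] + beta2 * x[:, 1] + rng.normal(0.0, 1.0, size=n)

    ones = np.ones(n)

    # Single-pollutant model: y ~ intercept + x1 (x2 omitted).
    single_design = np.column_stack([ones, x[:, 0]])
    single_coef, *_ = np.linalg.lstsq(single_design, y, rcond=None)

    # Joint model: y ~ intercept + x1 + x2.
    joint_design = np.column_stack([ones, x])
    joint_coef, *_ = np.linalg.lstsq(joint_design, y, rcond=None)

    # Return only the slope attributed to pollutant 1 in each model.
    return single_coef[1], joint_coef[1]


if __name__ == "__main__":
    single, joint = simulate_slopes()
    print(f"single-pollutant slope for x1: {single:.3f}")
    print(f"joint-model slope for x1:      {joint:.3f}")
```

Under these assumptions the single-pollutant slope converges to `beta1 + rho * beta2` and so absorbs part of the omitted pollutant's effect, while the joint model recovers a value close to `beta1`. With real data the joint model has its own difficulties, since strong collinearity inflates the uncertainty of both coefficients, which is consistent with the abstract's point that identifying the 'correct' pollutant is itself a problem.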
