Evaluating heterogeneity in indoor and outdoor air pollution using land-use regression and constrained factor analysis.

Jonathan I Levy, Jane E Clougherty, Lisa K Baxter, E Andres Houseman, Christopher J Paciorek
Research report (Health Effects Institute), volume 152, pages 5-80; discussion 81-91. Published 2010-12-01. Journal Article.

Abstract

Previous studies have identified associations between traffic exposures and a variety of adverse health effects, but many of these studies relied on proximity measures rather than measured or modeled concentrations of specific air pollutants, complicating interpretability of the findings. An increasing number of studies have used land-use regression (LUR) or other techniques to model small-scale variability in concentrations of specific air pollutants. However, these studies have generally considered a limited number of pollutants, focused on outdoor concentrations (or indoor concentrations of ambient origin) when indoor concentrations are better proxies for personal exposures, and have not taken full advantage of statistical methods for source apportionment that may have provided insight about the structure of the LUR models and the interpretability of model results. Given these issues, the primary objective of our study was to determine predictors of indoor and outdoor residential concentrations of multiple traffic-related air pollutants within an urban area, based on a combination of central site monitoring data; geographic information system (GIS) covariates reflecting traffic and other outdoor sources; questionnaire data reflecting indoor sources and activities that affect ventilation rates; and factor-analytic methods to better infer source contributions. As part of a prospective birth cohort study assessing asthma etiology in urban Boston, we collected indoor and/or outdoor 3-to-4-day samples of nitrogen dioxide (NO2) and fine particulate matter with an aerodynamic diameter ≤ 2.5 µm (PM2.5) at 44 residences during multiple seasons of the year from 2003 through 2005. We performed reflectance analysis, x-ray fluorescence spectroscopy (XRF), and high-resolution inductively coupled plasma-mass spectrometry (ICP-MS) on particle filters to estimate the concentrations of elemental carbon (EC), trace elements, and water-soluble metals, respectively. 
We derived multiple indicators of traffic using Massachusetts Highway Department (MHD) data and traffic counts collected outside the residences where the air monitoring was conducted. We used a standardized questionnaire to collect data on home characteristics and occupant behaviors. Additional housing information was collected through property tax records. Ambient concentrations of pollutants as well as meteorological data were collected from centrally located ambient monitors. We used GIS-based LUR models to explain spatial and temporal variability in residential outdoor concentrations of PM2.5, EC, and NO2. We subsequently derived latent-source factors for residential outdoor concentrations using confirmatory factor analysis constrained to nonnegative loadings. We developed LUR models to determine whether GIS covariates and other predictors explain factor variability and thereby support initial factor interpretations. To evaluate indoor concentrations, we developed physically interpretable regression models that explored the relationship between measured indoor and outdoor concentrations, relying on questionnaire data to characterize indoor sources and activities. Because outdoor pollutant concentrations measured directly outside of homes are unlikely to be available for most large epidemiologic studies, we developed regression models to explain indoor concentrations of PM2.5, EC, and NO2 as a function of other, more readily available data: GIS covariates, questionnaire data reflecting both sources and ventilation, and central site monitoring data. As we did for outdoor concentrations, we then derived latent-source factors for residential indoor concentrations and developed regression models explaining variability in these indoor latent-source factors. 
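The land-use regression step described above can be illustrated with a minimal sketch. It is not the study's model: the two predictors (a central-site concentration and roadway length within a buffer), the coefficient values, and the data are all synthetic assumptions chosen only to show the mechanics of fitting a residential outdoor concentration to one temporal and one spatial covariate by ordinary least squares.

```python
import numpy as np

def design_matrix(central_site, road_length_m):
    """Intercept column plus one temporal and one spatial predictor (hypothetical names)."""
    return np.column_stack([np.ones_like(central_site), central_site, road_length_m])

def fit_lur(central_site, road_length_m, outdoor_conc):
    """Ordinary least squares fit; returns [intercept, central-site slope, roadway slope]."""
    X = design_matrix(central_site, road_length_m)
    coef, *_ = np.linalg.lstsq(X, outdoor_conc, rcond=None)
    return coef

def r_squared(central_site, road_length_m, outdoor_conc, coef):
    """Fraction of variance in the outdoor concentration explained by the fit."""
    resid = outdoor_conc - design_matrix(central_site, road_length_m) @ coef
    total = outdoor_conc - outdoor_conc.mean()
    return 1.0 - (resid @ resid) / (total @ total)

# Synthetic data only: none of these numbers come from the study.
rng = np.random.default_rng(0)
n = 200
central = rng.uniform(5.0, 25.0, n)    # central-site concentration (assumed units)
road = rng.uniform(0.0, 2000.0, n)     # roadway length within a buffer, in meters
true_coef = np.array([1.0, 0.8, 0.002])
outdoor = (true_coef[0] + true_coef[1] * central + true_coef[2] * road
           + rng.normal(0.0, 0.5, n))

coef = fit_lur(central, road, outdoor)
r2 = r_squared(central, road, outdoor, coef)
print("coefficients:", coef)
print("R^2:", r2)
```

In a sketch like this, the central-site term carries the day-to-day (temporal) variation while the roadway term carries the between-home (spatial) variation, which is the separation the abstract attributes to the LUR models.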
Finally, to provide insight about the effects of improved characterization of exposures for the results of subsequent epidemiologic investigations, we developed a simulation framework to quantitatively compare the implications of using exposure models derived from validation studies with the use of other surrogate models with varying amounts of measurement error. The concentrations of outdoor PM2.5 were strongly associated with the central site monitor data, whereas EC concentrations showed greater spatial variability, especially during colder months, and were predicted by the length of roadway within 200 m of the home. Outdoor NO2 also showed significant spatial variability, predicted in part by population density and roadway length within 50 m of the home. Our constrained factor analysis of outdoor concentrations produced loadings indicating long-range transport, brake wear and traffic exhaust, diesel exhaust, fuel oil combustion, and resuspended road dust as sources; corresponding LUR models largely corroborated these factor interpretations through covariate significance. For example, long-range transport was predicted by central site PM2.5 and season; brake wear and traffic exhaust and resuspended road dust by traffic and residential density; diesel exhaust by the percentage of diesel traffic on the nearest major road; and fuel oil combustion by population density. Our modeling of the concentrations of indoor pollutants demonstrated substantial variability in indoor-outdoor relationships across constituents, helping to separate constituents dominated by outdoor sources (e.g., S, Se, and V) from those dominated by indoor sources (e.g., Ca and Si). 
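To make the idea of nonnegative loadings concrete, the following sketch factorizes a synthetic samples-by-constituents matrix into nonnegative source scores and nonnegative loadings. It swaps in a different technique from the one named in the abstract: this is plain nonnegative matrix factorization with multiplicative updates, not the confirmatory factor analysis used in the study, and the two "sources", their loadings, and the data are invented for illustration.

```python
import numpy as np

def nonnegative_factorize(X, n_factors, n_iter=1000, seed=0, eps=1e-9):
    """Approximate nonnegative X (samples x constituents) as scores @ loadings.

    Both factors stay nonnegative because each multiplicative update
    rescales positive entries by a ratio of nonnegative quantities.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_constituents = X.shape
    scores = rng.uniform(0.1, 1.0, (n_samples, n_factors))
    loadings = rng.uniform(0.1, 1.0, (n_factors, n_constituents))
    for _ in range(n_iter):
        loadings *= (scores.T @ X) / (scores.T @ scores @ loadings + eps)
        scores *= (X @ loadings.T) / (scores @ loadings @ loadings.T + eps)
    return scores, loadings

# Synthetic example: two hypothetical sources, five hypothetical constituents.
rng = np.random.default_rng(1)
true_scores = rng.uniform(0.0, 1.0, (60, 2))
true_loadings = np.array([[1.0, 0.6, 0.3, 0.0, 0.0],
                          [0.0, 0.0, 0.1, 0.8, 0.5]])
X = true_scores @ true_loadings + 0.01 * np.abs(rng.normal(size=(60, 5)))

scores, loadings = nonnegative_factorize(X, n_factors=2)
rel_err = np.linalg.norm(X - scores @ loadings) / np.linalg.norm(X)
print("relative reconstruction error:", rel_err)
print("estimated loadings (rows are factors):")
print(loadings)
```

The nonnegativity constraint is what makes each row of the loadings readable as a source profile: a constituent can contribute to a factor or be absent from it, but cannot offset another constituent with a negative weight.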
Regression models indicated that indoor PM2.5 was not influenced substantially by local traffic but had significant indoor sources (cooking activity and occupant density), while EC was associated with distance to the nearest designated truck route, and NO2 was associated with both traffic density within 50 m of the home and gas stove usage. Our constrained factor analysis of indoor concentrations helped to separate outdoor-dominated factors from indoor-dominated factors, though some factors appeared to be influenced by both indoor and outdoor sources. Subsequent factor analyses of the indoor-attributable fractions from indoor-outdoor regression models provided generally consistent interpretations of indoor-dominated factors. The use of regression models on indoor factors demonstrated the limited predictive power of questionnaire data related to indoor sources, but reinforced the viability of modeling indoor concentrations of pollutants of ambient origin. In spite of the relatively weak predictive power of some of the indoor-concentration regression models, our epidemiologic simulations illustrated that exposure models with fairly modest R² values (in the range of 0.3 through 0.4, corresponding to the regression models for PM2.5 and NO2) yielded substantial improvements in epidemiologic study performance relative to the use of exposure proxies that could be applied in the absence of validation studies. In spite of limitations related to sample size and available covariate data, our study demonstrated significant outdoor spatial variability within an urban area in NO2 and in several constituents of airborne particles. LUR techniques combined with constrained factor analysis helped to disentangle the contributions to temporal variability of local, long-range transport, and other sources, ultimately allowing exposures from defined source categories to be investigated in epidemiologic studies. 
For the indoor residential environment, we demonstrated substantial variability in indoor-outdoor relationships among particle constituents; then, using information from public databases and focused questionnaire data, we were able to predict indoor concentrations for a subset of key pollutants. Constrained factor analysis methods applied to the indoor environment helped to separate indoor sources from outdoor sources. The corresponding indoor regression models had limited predictive power, reinforcing the complexity of characterizing the indoor environment when only limited information about key predictors is available. This finding also underscores the likelihood that these regression models might characterize indoor concentrations of pollutants with ambient origins better than they can the indoor concentrations from all sources. Our findings provide direction for future studies characterizing indoor exposure sources and patterns, and our epidemiologic simulation reinforced the importance of reducing measurement error in a context where many traffic-related air pollutants are influenced by both indoor and outdoor sources. The combination of analytical techniques used in our study could ultimately allow for more refined exposure characterization and evaluation of the relative contributions of various sources to health outcomes in epidemiologic studies.
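The point about measurement error can be illustrated with a small simulation. This is a sketch under stated assumptions, not the study's simulation framework, which the abstract does not describe in detail: it assumes a continuous health outcome, a linear exposure-response relationship, and classical (additive, independent) measurement error, under which the estimated slope is attenuated by a factor equal to the R² between the surrogate and the true exposure. The R² values of 0.35 and 0.05 are illustrative choices for a modest exposure model and a weak proxy.

```python
import numpy as np

def attenuated_slope(r2, beta=1.0, n=20000, seed=0):
    """Estimate the exposure-response slope using a noisy exposure surrogate.

    r2 is the squared correlation between the surrogate and the true
    exposure under an assumed classical error model.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(0.0, 1.0, n)                       # true exposure, unit variance
    y = beta * x + rng.normal(0.0, 1.0, n)            # hypothetical continuous outcome
    error_var = (1.0 - r2) / r2                       # yields corr(w, x)^2 == r2 in expectation
    w = x + rng.normal(0.0, np.sqrt(error_var), n)    # surrogate exposure
    return np.cov(w, y)[0, 1] / np.var(w, ddof=1)     # simple regression slope of y on w

slope_model = attenuated_slope(0.35)   # surrogate comparable to a modest exposure model
slope_proxy = attenuated_slope(0.05)   # much weaker proxy
print("slope with R^2 = 0.35 surrogate:", slope_model)
print("slope with R^2 = 0.05 surrogate:", slope_proxy)
```

Under these assumptions the true slope is 1.0, and the estimate shrinks toward zero roughly in proportion to the surrogate's R², which is one way to see why even a modest exposure model can outperform a crude proxy.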
