Part 2. Development of Enhanced Statistical Methods for Assessing Health Effects Associated with an Unknown Number of Major Sources of Multiple Air Pollutants.

Eun Sug Park, Elaine Symanski, Daikwon Han, Clifford Spiegelman
Research report (Health Effects Institute), vol. 183, Pt 1-2, pp. 51-113. Journal Article. Published 2015-06-01.

Abstract

A major difficulty in assessing source-specific health effects is that source-specific exposures cannot be measured directly; rather, they must be estimated by a source-apportionment method such as multivariate receptor modeling. The uncertainty in source apportionment (uncertainty in source-specific exposure estimates and model uncertainty due to the unknown number of sources and identifiability conditions) has been largely ignored in previous studies. Also, the spatial dependence of multipollutant data collected from multiple monitoring sites has not yet been incorporated into multivariate receptor modeling. The objectives of this project were (1) to develop a multipollutant approach that incorporates both sources of uncertainty in source apportionment into the assessment of source-specific health effects and (2) to develop enhanced multivariate receptor models that can account for spatial correlations in multipollutant data collected from multiple sites. We employed a Bayesian hierarchical modeling framework consisting of multivariate receptor models, health-effects models, and a hierarchical model on latent source contributions. For the health model, we focused on the time-series design. Each combination of the number of sources and identifiability conditions (additional constraints on model parameters) defines a different model. We built a set of plausible models using extensive exploratory data analyses and information from previous studies, and then computed posterior model probabilities to estimate model uncertainty. Parameter estimation and model uncertainty estimation were carried out simultaneously by Markov chain Monte Carlo (MCMC) methods. We validated the methods using simulated data and illustrated them using PM2.5 (particulate matter ≤ 2.5 μm in aerodynamic diameter) speciation data and mortality data from Phoenix, Arizona, and Houston, Texas.
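As a rough illustration of the receptor-model setup described above, the sketch below simulates multipollutant data as latent source contributions multiplied by source profiles, plus noise. It then recovers the contributions by ordinary least squares under the simplifying assumption that the profiles are known. This is not the Bayesian procedure of the report, which treats the profiles, the contributions, and the number of sources as unknown. All sizes, distributions, and variable names here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical problem sizes (not taken from the report).
n_days, n_species, n_sources = 200, 8, 3

# Source profiles P (n_sources x n_species): each row is one source's
# species composition, normalised to sum to 1.
P = rng.dirichlet(np.ones(n_species), size=n_sources)

# Latent, nonnegative source contributions A (n_days x n_sources).
A = rng.gamma(shape=2.0, scale=1.5, size=(n_days, n_sources))

# Basic receptor model: observed concentrations X = A P + E.
E = rng.normal(scale=0.01, size=(n_days, n_species))
X = A @ P + E

# Simplifying assumption: P is known, so each day's contributions solve
# a linear least-squares problem P^T a_t ~ x_t (no nonnegativity enforced).
A_hat_T, *_ = np.linalg.lstsq(P.T, X.T, rcond=None)
A_hat = A_hat_T.T

print("largest absolute error in recovered contributions:",
      float(np.max(np.abs(A_hat - A))))
```

In practice neither the profiles nor the number of sources is known, which is why the abstract stresses identifiability conditions and model uncertainty.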
The Phoenix data included counts of cardiovascular deaths and daily PM2.5 speciation data from 1995 to 1997. The Houston data included respiratory mortality data and 24-hour PM2.5 speciation data sampled every six days from a region near the Houston Ship Channel from 2002 to 2005. We also developed a Bayesian spatial multivariate receptor modeling approach that, while handling the unknown number of sources and identifiability conditions, incorporates spatial correlations in multipollutant data collected from multiple sites into the estimation of source profiles and contributions. The approach is based on the discrete process convolution model for multivariate spatial processes. It was applied to 24-hour ambient air concentrations of 17 volatile organic compounds (VOCs) measured at nine monitoring sites in Harris County, Texas, from 2000 to 2005. Simulation results indicated that our methods were accurate in identifying the true model and that the estimated parameters were close to the true values. The results from our methods agreed in general with previous studies on the source apportionment of the Phoenix data in terms of estimated source profiles and contributions. However, we had a greater number of statistically insignificant findings, which was likely a natural consequence of incorporating uncertainty in the estimated source contributions into the health-effects parameter estimation. For the Houston data, a model with five sources (which appeared to be Sulfate-Rich Secondary Aerosol, Motor Vehicles, Industrial Combustion, Soil/Crustal Matter, and Sea Salt) showed the highest posterior model probability among the candidate models considered when fitted simultaneously to the PM2.5 and mortality data. There was a statistically significant positive association between respiratory mortality and same-day PM2.5 concentrations attributed to one of the sources (probably industrial combustion).
The Bayesian spatial multivariate receptor modeling approach applied to the VOC data gave the highest posterior model probability to a model with five sources (which appeared to be refinery, petrochemical production, gasoline evaporation, natural gas, and vehicular exhaust). The candidate models differed in the number of sources, which ranged from three to seven, and in their identifiability conditions. Our multipollutant approach to assessing source-specific health effects has advantages over a single-pollutant approach in that it can estimate total health effects from multiple pollutants and can also identify emission sources that are responsible for adverse health effects. Our Bayesian approach can incorporate not only uncertainty in the estimated source contributions, but also model uncertainty that has not been addressed in previous studies on assessing source-specific health effects. The new Bayesian spatial multivariate receptor modeling approach enables predictions of source contributions at unmonitored sites, minimizing exposure misclassification and providing improved exposure estimates along with their uncertainty estimates, as well as accounting for uncertainty in the number of sources and identifiability conditions.
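The model-comparison step mentioned above (one candidate model per combination of source count and identifiability conditions, ranked by posterior model probability) can be sketched as a plain application of Bayes' theorem over a finite set of models. The sketch assumes that a log marginal likelihood is already available for each candidate; the abstract does not say how the authors obtained these quantities within their MCMC scheme. The function name, the model labels, and all numbers are invented for illustration and are not results from the report.

```python
import math


def posterior_model_probs(log_marginal_liks, prior_probs=None):
    """Turn log marginal likelihoods into posterior model probabilities.

    log_marginal_liks: dict mapping model label -> log p(data | model)
    prior_probs: dict mapping model label -> prior probability
                 (defaults to a uniform prior over the candidates)
    """
    labels = list(log_marginal_liks)
    if prior_probs is None:
        prior_probs = {m: 1.0 / len(labels) for m in labels}

    # log p(data | model) + log p(model), up to a shared constant.
    log_joint = {m: log_marginal_liks[m] + math.log(prior_probs[m])
                 for m in labels}

    # Subtract the maximum before exponentiating to avoid underflow.
    top = max(log_joint.values())
    weights = {m: math.exp(v - top) for m, v in log_joint.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}


# Invented log marginal likelihoods for four hypothetical candidate models.
log_ml = {"M1": -1520.4, "M2": -1498.7, "M3": -1495.2, "M4": -1497.9}
probs = posterior_model_probs(log_ml)
best = max(probs, key=probs.get)

for label, p in probs.items():
    print(f"{label}: {p:.4f}")
print("highest posterior probability:", best)
```

Working on the log scale matters because marginal likelihoods for realistic data sets are far too small to represent directly as floating-point numbers.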
