Heterogeneous graphical model for non-negative and non-Gaussian PM2.5 data

Impact factor: 1.0 · Zone 4 (Mathematics) · JCR Q3, STATISTICS & PROBABILITY
Jiaqi Zhang, Xinyan Fan, Yang Li, Shuangge Ma
Journal: Journal of the Royal Statistical Society Series C-Applied Statistics
Publication date: 2022-06-22
DOI: 10.1111/rssc.12575
URL: https://onlinelibrary.wiley.com/doi/10.1111/rssc.12575
Citations: 1

Abstract

Studies on the conditional relationships between PM2.5 concentrations among different regions are of great interest for the joint prevention and control of air pollution. Because of seasonal changes in atmospheric conditions, spatial patterns of PM2.5 may differ throughout the year. Additionally, concentration data are both non-negative and non-Gaussian. These data features pose significant challenges to existing methods. This study proposes a heterogeneous graphical model for non-negative and non-Gaussian data via the score matching loss. The proposed method simultaneously clusters multiple datasets and estimates a graph for variables with complex properties in each cluster. Furthermore, our model involves a network that indicates similarity among datasets, and this network can have additional applications. In simulation studies, the proposed method outperforms competing alternatives in both clustering and edge identification. We also analyse the spatial correlations of PM2.5 concentrations across Taiwan's regions using data obtained in 2019 from 67 air-quality monitoring stations. The 12 months are clustered into four groups: January–March, April, May–September and October–December, and the corresponding graphs have 153, 57, 86 and 167 edges respectively. The results show obvious seasonality, which is consistent with the meteorological literature. Geographically, PM2.5 concentrations are more strongly correlated among the regions of northern Taiwan and among the regions of southern Taiwan, respectively. These results can provide valuable information for developing joint air-quality control strategies.
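
The abstract names a score matching loss for non-negative data but does not give its form. As a rough illustration only, the sketch below evaluates the classical non-negative score matching objective (Hyvärinen's extension of score matching) for a single dataset, under the assumption of a zero-mean truncated Gaussian density with log p(x) = -0.5 x'Kx + const on x >= 0. This is not the authors' estimator: their loss, penalties and clustering step may differ, and the names `nonneg_score_matching_loss` and `edges` are hypothetical helpers introduced here.

```python
def nonneg_score_matching_loss(samples, K):
    """Empirical non-negative score matching loss.

    Assumes log p(x) = -0.5 * x'Kx + const on x >= 0 (an illustrative
    model, not taken from the paper), so the model score is
    psi_j(x) = -(Kx)_j and its own-coordinate derivative is -K[j][j].
    Per sample the loss is
        sum_j 2*x_j*psi_j + x_j^2 * d(psi_j)/d(x_j) + 0.5 * (psi_j * x_j)^2
    """
    m = len(K)
    total = 0.0
    for x in samples:
        for j in range(m):
            psi_j = -sum(K[j][k] * x[k] for k in range(m))
            dpsi_j = -K[j][j]
            total += (2.0 * x[j] * psi_j
                      + x[j] ** 2 * dpsi_j
                      + 0.5 * (psi_j * x[j]) ** 2)
    return total / len(samples)


def edges(K, tol=1e-8):
    """Read a graph off K: an edge (j, k) is a non-zero off-diagonal entry."""
    m = len(K)
    return [(j, k) for j in range(m) for k in range(j + 1, m)
            if abs(K[j][k]) > tol]


if __name__ == "__main__":
    K = [[2.0, 1.0, 0.0],
         [1.0, 3.0, 0.0],
         [0.0, 0.0, 1.0]]
    print(edges(K))  # [(0, 1)]
    print(nonneg_score_matching_loss([[1.0, 2.0]], [[1.0, 0.0], [0.0, 1.0]]))  # -6.5
```

In a full method of the kind the abstract describes, a loss like this would be minimised over K with a sparsity penalty for each cluster of datasets, and the edge count of a graph (for example the 153, 57, 86 and 167 edges reported above) would be the number of non-zero off-diagonal pairs in the corresponding estimate.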

Source journal
CiteScore: 2.50
Self-citation rate: 0.00%
Articles published: 76
Review time: >12 weeks
Journal description: The Journal of the Royal Statistical Society, Series C (Applied Statistics) is a journal of international repute for statisticians both inside and outside the academic world. The journal is concerned with papers which deal with novel solutions to real life statistical problems by adapting or developing methodology, or by demonstrating the proper application of new or existing statistical methods to them. At their heart, therefore, the papers in the journal are motivated by examples and statistical data of all kinds. The subject-matter covers the whole range of inter-disciplinary fields, e.g. applications in agriculture, genetics, industry, medicine and the physical sciences, and papers on design issues (e.g. in relation to experiments, surveys or observational studies). A deep understanding of statistical methodology is not necessary to appreciate the content. Although papers describing developments in statistical computing driven by practical examples are within its scope, the journal is not concerned with simply numerical illustrations or simulation studies. The emphasis of Series C is on case-studies of statistical analyses in practice.