LGHAP v2: a global gap-free aerosol optical depth and PM2.5 concentration dataset since 2000 derived via big Earth data analytics

IF 11.2 | Zone 1, Earth Sciences | Q1 GEOSCIENCES, MULTIDISCIPLINARY
Kaixu Bai, Ke Li, Liuqing Shao, Xinran Li, Chaoshun Liu, Zhengqiang Li, Mingliang Ma, Di Han, Yibing Sun, Zhe Zheng, Ruijie Li, Ni-Bin Chang, Jianping Guo
DOI: 10.5194/essd-16-2425-2024 (https://doi.org/10.5194/essd-16-2425-2024)
Journal: Earth System Science Data
Publication date: 2024-05-22
Publication type: Journal Article
Citations: 0

LGHAP v2: a global gap-free aerosol optical depth and PM2.5 concentration dataset since 2000 derived via big Earth data analytics
Abstract. The Long-term Gap-free High-resolution Air Pollutants (LGHAP) concentration dataset generated in our previous study has provided spatially contiguous daily aerosol optical depth (AOD) and fine particulate matter (PM2.5) concentrations at a 1 km grid resolution in China since 2000. This advancement empowered unprecedented assessments of regional aerosol variations and their influence on the environment, health, and climate over the past 20 years. However, there is a need to enhance such a high-quality AOD and PM2.5 concentration dataset with new robust features and extended spatial coverage. In this study, we present version 2 of a global-scale LGHAP dataset (LGHAP v2), which was generated using improved big Earth data analytics via a seamless integration of versatile data science, pattern recognition, and machine learning methods. Specifically, multimodal AODs and air quality measurements acquired from relevant satellites, ground monitoring stations, and numerical models were harmonized by harnessing the capability of random-forest-based data-driven models. Subsequently, an improved tensor-flow-based AOD reconstruction algorithm was developed to weave the harmonized multisource AOD products together for filling data gaps in Multi-Angle Implementation of Atmospheric Correction (MAIAC) AOD retrievals from Terra. The results of the ablation experiments demonstrated better performance of the improved tensor-flow-based gap-filling method in terms of both convergence speed and data accuracy. Ground-based validation results indicated good data accuracy of this global gap-free AOD dataset, with a correlation coefficient (R) of 0.85 and a root mean square error (RMSE) of 0.14 compared to the worldwide AOD observations from the AErosol RObotic NETwork (AERONET), outperforming the purely reconstructed AODs (R = 0.83, RMSE = 0.15), but they were slightly worse than raw MAIAC AOD retrievals (R = 0.88, RMSE = 0.11). 
For PM2.5 concentration mapping, a novel deep-learning approach, termed the SCene-Aware ensemble learning Graph ATtention network (SCAGAT), was hereby applied. While accounting for the scene representativeness of data-driven models across regions, the SCAGAT algorithm performed better during spatial extrapolation, largely reducing modeling biases over regions with limited and/or even absent in situ PM2.5 concentration measurements. The validation results indicated that the gap-free PM2.5 concentration estimates exhibit higher prediction accuracies, with an R of 0.95 and an RMSE of 5.7 µg m−3, compared to PM2.5 concentration measurements obtained from former holdout sites worldwide. Overall, while leveraging state-of-the-art methods in data science and artificial intelligence, a quality-enhanced LGHAP v2 dataset was generated through big Earth data analytics by cohesively weaving together multimodal AODs and air quality measurements from diverse sources. The gap-free, high-resolution, and global coverage merits render the LGHAP v2 dataset an invaluable database for advancing aerosol- and haze-related studies as well as triggering multidisciplinary applications for environmental management, health-risk assessment, and climate change attribution. All gap-free AOD and PM2.5 concentration grids in the LGHAP v2 dataset, as well as the data user guide and relevant visualization codes, are publicly accessible at https://zenodo.org/communities/ecnu_lghap (last access: 3 April 2024, Bai and Li, 2023a).
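The abstract reports accuracy as a correlation coefficient (R) and a root mean square error (RMSE) against reference observations. The following is a minimal sketch of how these two metrics are conventionally computed for collocated estimate/observation pairs, assuming R denotes the Pearson correlation coefficient. The function names and sample values are hypothetical illustrations; they are not taken from the LGHAP v2 code or data.

```python
import numpy as np


def pearson_r(estimates, observations):
    """Pearson correlation coefficient between two equal-length sequences."""
    est = np.asarray(estimates, dtype=float)
    obs = np.asarray(observations, dtype=float)
    est_anom = est - est.mean()  # deviations from the mean estimate
    obs_anom = obs - obs.mean()  # deviations from the mean observation
    denom = np.sqrt((est_anom ** 2).sum() * (obs_anom ** 2).sum())
    return float((est_anom * obs_anom).sum() / denom)


def rmse(estimates, observations):
    """Root mean square error between estimates and observations."""
    est = np.asarray(estimates, dtype=float)
    obs = np.asarray(observations, dtype=float)
    return float(np.sqrt(np.mean((est - obs) ** 2)))


if __name__ == "__main__":
    # Hypothetical collocated AOD pairs (gridded estimate vs. ground reference).
    aod_estimates = [0.10, 0.25, 0.40, 0.55]
    aod_reference = [0.12, 0.20, 0.45, 0.50]
    print("R    =", round(pearson_r(aod_estimates, aod_reference), 3))
    print("RMSE =", round(rmse(aod_estimates, aod_reference), 3))
```

A higher R together with a lower RMSE indicates closer agreement with the reference, which is how the abstract ranks raw MAIAC retrievals, the gap-free AOD, and the purely reconstructed AOD.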
Source journal
Earth System Science Data
Categories: GEOSCIENCES, MULTIDISCIPLINARY; METEOROLOGY & ATMOSPHERIC SCIENCES
CiteScore: 18.00
Self-citation rate: 5.30%
Articles published: 231
Review time: 35 weeks
About the journal: Earth System Science Data (ESSD) is an international, interdisciplinary journal that publishes articles on original research data in order to promote the reuse of high-quality data in the field of Earth system sciences. The journal welcomes submissions of original data or data collections that meet the required quality standards and have the potential to contribute to the goals of the journal. It includes sections dedicated to regular-length articles, brief communications (such as updates to existing data sets), commentaries, review articles, and special issues. ESSD is abstracted and indexed in several databases, including Science Citation Index Expanded, Current Contents/PCE, Scopus, ADS, CLOCKSS, CNKI, DOAJ, EBSCO, Gale/Cengage, GoOA (CAS), and Google Scholar, among others.