Quantitative reconstruction of long-term spatiotemporal patterns of high-resolution ground-level NO2 concentrations in mainland China using fusion techniques and a machine learning framework
Authors: Zhen Li, Heng Dong, Sicong He, Huan Huang
Journal: Environment International, volume 202, Article 109672 (JCR Q1, Environmental Sciences; impact factor 10.3)
Publication date: 2025-07-08
DOI: 10.1016/j.envint.2025.109672
URL: https://www.sciencedirect.com/science/article/pii/S0160412025004234
Citation count: 0
Abstract
Nitrogen dioxide (NO2) is a critical trace gas that plays multiple roles in the atmosphere and poses potential threats to human health. However, existing satellite monitoring methods face challenges, including limited satellite mission durations, poor data quality, and low spatial resolution, which hinder the provision of long-term, high-precision NO2 information. To address these issues, this study uses the TROPOMI tropospheric NO2 column concentration product as a baseline and employs partition and cumulative distribution function (CDF) techniques to generate a satellite fusion dataset with both a long time span and high consistency. Based on this dataset, a high-performance, high-spatial-resolution, long-term surface NO2 estimation model was developed using machine learning algorithms combined with multi-source geographic data. The model estimates daily average near-surface NO2 concentrations (1 km² resolution) for mainland China from 2014 to 2020. The results show that the proposed fusion method effectively integrates OMI and TROPOMI data, improves the spatial correlation between satellite products by 16.2% (R = 0.74 → 0.86), significantly enhances the spatial coverage, and thus more accurately characterizes the spatial distribution of NO2. The surface-level NO2 estimates based on the LGBM model achieved an R² of 0.85 in ten-fold cross-validation, with a corresponding root mean square error (RMSE) and mean absolute error (MAE) of 7.51 µg/m³ and 5.22 µg/m³, respectively, demonstrating good extrapolation ability for temporal variations. The long time-series results accurately reflect the temporal and spatial evolution of NO2 in mainland China, while the high-precision estimates provide detailed pollution exposure information, revealing urban-scale pollution differences and seasonal variations.
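The abstract names CDF techniques as the basis for aligning OMI with the TROPOMI baseline but does not give the procedure. The following is only a minimal sketch of generic CDF matching (quantile mapping), assuming paired samples from an overlap period; the function name `cdf_match`, the parameter `n_quantiles`, and the synthetic gamma-distributed data are illustrative assumptions rather than the authors' implementation, and the partition step mentioned in the abstract is not reproduced here.

```python
import numpy as np


def cdf_match(source, reference, n_quantiles=101):
    """Map `source` values onto the empirical distribution of `reference`.

    Hypothetical helper: generic quantile mapping, not the paper's exact method.
    """
    source = np.asarray(source, dtype=float)
    reference = np.asarray(reference, dtype=float)
    probs = np.linspace(0.0, 1.0, n_quantiles)
    # Empirical quantiles of each product over the overlap period (NaNs ignored).
    src_q = np.nanquantile(source, probs)
    ref_q = np.nanquantile(reference, probs)
    # Piecewise-linear transfer function: source quantile -> reference quantile.
    return np.interp(source, src_q, ref_q)


# Synthetic stand-ins for the two satellite products (assumed data, not real retrievals).
rng = np.random.default_rng(0)
tropomi_like = rng.gamma(shape=2.0, scale=3.0, size=5000)            # baseline product
omi_like = 0.6 * rng.gamma(shape=2.0, scale=3.0, size=5000) + 1.0    # biased product

omi_matched = cdf_match(omi_like, tropomi_like)
print("median before:", np.median(omi_like))
print("median after: ", np.median(omi_matched))
print("median target:", np.median(tropomi_like))
```

In a partitioned variant, one would presumably fit a separate transfer function per region or season and apply each to its own subset; that choice is a guess about what "partition" means here and should be checked against the full paper.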
About the journal:
Environmental Health publishes manuscripts focusing on critical aspects of environmental and occupational medicine, including studies in toxicology and epidemiology, to illuminate the human health implications of exposure to environmental hazards. The journal adopts an open-access model and practices open peer review.
It caters to scientists and practitioners across all environmental science domains, directly or indirectly impacting human health and well-being. With a commitment to enhancing the prevention of environmentally-related health risks, Environmental Health serves as a public health journal for the community and scientists engaged in matters of public health significance concerning the environment.