Predicting long-term air pollutant concentrations through deep learning-based integration of heterogeneous urban data

IF 3.9 | Region 3: Environmental Science and Ecology | JCR Q2: ENVIRONMENTAL SCIENCES

Atmospheric Pollution Research | Pub Date: 2024-08-08 | DOI: 10.1016/j.apr.2024.102282

Chao Chen, Hui Liu, Chengming Yu

Volume 15, Issue 11, Article 102282 | Source: https://www.sciencedirect.com/science/article/pii/S1309104224002472

Citations: 0

Abstract

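Accurate prediction of air pollutant concentrations, specifically of inhalable particulate matter such as PM2.5, is crucial for proactive measures to safeguard the well-being of urban residents. This paper focuses on addressing the perceptible latency effect in long-term PM2.5 predictions produced by existing statistical models. We emphasize the importance of numerical computations in capturing substantial changes, and enhance prediction accuracy by integrating them with high-dimensional, diverse urban data. Specifically, our approach collects data from a global-to-meso-scale atmospheric dispersion model named System for Integrated modeLling of Atmospheric coMposition (SILAM), along with numerical weather forecasts, traffic congestion measurements, meteorological factors and static sources (road network and points of interest). We find that existing deep learning models are prone to overfitting when applied to complex datasets, primarily because they treat diverse data types uniformly as time series without adapting to the specific characteristics of each data type. To counter this, we propose a simple yet transferable deep learning architecture that focuses on the proper use of the various data types. Additionally, our comparative analysis, through a case study in Shenzhen, China, shows that our model not only enhances SILAM dispersion accuracy for 24-hour-ahead PM2.5 forecasts by a significant 30.3%, but also mitigates the noticeable latency effect of existing models by 19.5%. Finally, an ablation study further validates the importance of each data source and module of our approach.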
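The abstract does not say how the latency effect is quantified. One way to make the idea concrete, offered here only as an illustrative assumption and not as the authors' metric, is to look for the time shift at which a forecast correlates best with the observations: a forecast that merely echoes past values peaks at a positive shift. The function names (`pearson`, `estimate_lag`) and the synthetic series below are hypothetical and contain no data from the paper.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def estimate_lag(observed, predicted, max_lag):
    """Find the shift k in 0..max_lag maximizing corr(observed[t], predicted[t + k]).

    A result k > 0 means the forecast reproduces observed changes k steps late.
    Returns (best_k, best_correlation).
    """
    best_k, best_r = 0, -2.0
    for k in range(max_lag + 1):
        obs = observed[: len(observed) - k]  # drop the last k observations
        pred = predicted[k:]                 # drop the first k predictions
        r = pearson(obs, pred)
        if r > best_r:
            best_k, best_r = k, r
    return best_k, best_r


# Synthetic hourly series with a daily cycle (made-up values for illustration).
observed = [
    20 + 15 * math.sin(2 * math.pi * t / 24) + 5 * math.sin(2 * math.pi * t / 7)
    for t in range(96)
]

# A persistence-like forecast that echoes the observation from 3 hours earlier.
# The first 3 entries are filler so both series have the same length.
lagged_forecast = observed[:3] + observed[:-3]

lag, corr = estimate_lag(observed, lagged_forecast, max_lag=6)
print(f"estimated lag: {lag} h, correlation at that lag: {corr:.3f}")
```

Under this reading, a lower estimated lag for a model than for a baseline on the same test period would be one sign of a reduced latency effect. The paper's reported 19.5% figure may rest on a different definition.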

Source journal

Atmospheric Pollution Research (ENVIRONMENTAL SCIENCES)

CiteScore: 8.30
Self-citation rate: 6.70%
Articles published: 256
Review time: 36 days

About the journal: Atmospheric Pollution Research (APR) is an international journal designed for the publication of articles on air pollution. Papers should present novel experimental results, theory and modeling of air pollution on local, regional, or global scales. Areas covered are research on inorganic, organic, and persistent organic air pollutants, air quality monitoring, air quality management, atmospheric dispersion and transport, air-surface (soil, water, and vegetation) exchange of pollutants, dry and wet deposition, indoor air quality, exposure assessment, health effects, satellite measurements, natural emissions, atmospheric chemistry, greenhouse gases, and effects on climate change.