A graph neural network and Transformer-based model for PM2.5 prediction through spatiotemporal correlation

IF 4.8 · Zone 2 (Environmental Science and Ecology) · JCR Q1 (Computer Science, Interdisciplinary Applications)
Yao Ye, Yong Cao, Yibo Dong, Hua Yan
DOI: 10.1016/j.envsoft.2025.106501
Journal: Environmental Modelling & Software, vol. 191, Article 106501
Published: 2025-04-28
Article page: https://www.sciencedirect.com/science/article/pii/S1364815225001859
Citations: 0

Abstract

Accurate prediction of fine particulate matter (PM2.5) concentration in the atmosphere is important for both urban residents and government agencies. Existing research has applied and developed various traditional and hybrid network models, all of which have contributed to PM2.5 concentration prediction. Although Transformer-based networks show unique advantages in time series prediction tasks, the Transformer architecture extracts spatiotemporal features inadequately and is susceptible to interference from irrelevant data. To address these challenges, a graph neural network (GNN) and Transformer-based model for PM2.5 concentration prediction, named GNN-Transformer, is proposed. First, an instantaneous phase synchronization-based estimator is designed to mitigate the negative influence of irrelevant data on prediction performance. Next, a GNN-based spatial impact modeling layer is introduced to extract the spatial impacts between the target city and its surrounding cities. Finally, a Transformer-based spatiotemporal prediction module is devised to further extract the spatiotemporal features between the target city and its surrounding cities and to generate more accurate predictions of PM2.5 concentration. Experiments on real-world datasets demonstrate that the proposed GNN-Transformer outperforms other models in both short-term and long-term prediction tasks. Specifically, for the 3-h prediction task, the proposed model achieves the lowest Mean Absolute Error (MAE) of 6.35 and the highest R² of 0.97. The proposed model also performs best in multiscale prediction tasks across different time spans, achieving the best results for the 24-h prediction task (MAE = 18.66, R² = 0.76). Furthermore, the proposed method can accurately predict high PM2.5 concentrations, achieving the highest Critical Success Index (CSI) and Probability of Detection (POD), along with the lowest False Alarm Ratio (FAR). This performance may enable early warnings for potential air pollution events.
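
The abstract reports MAE, R², CSI, POD and FAR without defining them. The following is a minimal sketch of the standard textbook definitions of these five scores, not the authors' evaluation code. The function names, the sample values and the 75 threshold used to mark a "high concentration" event are all illustrative assumptions; the abstract does not state which threshold the paper uses.

```python
from math import fsum, nan


def mae(y_true, y_pred):
    """Mean Absolute Error: average magnitude of the prediction errors."""
    return fsum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fsum(y_true) / len(y_true)
    ss_res = fsum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = fsum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot if ss_tot else nan


def event_scores(y_true, y_pred, threshold):
    """Return (CSI, POD, FAR) for events defined as value > threshold.

    hits         : event observed and predicted
    misses       : event observed but not predicted
    false_alarms : event predicted but not observed
    """
    hits = misses = false_alarms = 0
    for t, p in zip(y_true, y_pred):
        observed, forecast = t > threshold, p > threshold
        if observed and forecast:
            hits += 1
        elif observed:
            misses += 1
        elif forecast:
            false_alarms += 1

    def ratio(num, den):
        # Undefined when no relevant events occurred.
        return num / den if den else nan

    csi = ratio(hits, hits + misses + false_alarms)
    pod = ratio(hits, hits + misses)
    far = ratio(false_alarms, hits + false_alarms)
    return csi, pod, far


if __name__ == "__main__":
    # Hypothetical hourly PM2.5 values, for illustration only.
    observed = [30, 80, 120, 60, 90, 40]
    predicted = [35, 70, 110, 80, 60, 45]
    print("MAE:", mae(observed, predicted))
    print("R2:", r2(observed, predicted))
    print("CSI, POD, FAR:", event_scores(observed, predicted, threshold=75))
```

Under these definitions, higher CSI and POD together with lower FAR indicate that high-concentration events are detected more reliably with fewer spurious alarms, which is the sense in which the abstract links these scores to early warning.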
Source journal
Environmental Modelling & Software (Engineering and Technology: Engineering, Environmental)
CiteScore: 9.30
Self-citation rate: 8.20%
Articles published: 241
Review time: 60 days
Journal description: Environmental Modelling & Software publishes contributions, in the form of research articles, reviews and short communications, on recent advances in environmental modelling and/or software. The aim is to improve our capacity to represent, understand, predict or manage the behaviour of environmental systems at all practical scales, and to communicate those improvements to a wide scientific and professional audience.