Development and interpretation of PM2.5 estimation model for the Seoul Metropolitan Area using machine learning and explainable AI

IF 3.5 | Zone 3, Environmental Science and Ecology | JCR Q2, Environmental Sciences
Myeong-Gyun Kim, Se-Young Kim, Kwangyul Lee, Pilho Kim, Hyoseon Kim, Hyo-Jong Song
Atmospheric Pollution Research, vol. 16, no. 11, Article 102672. Published 2025-07-23. DOI: 10.1016/j.apr.2025.102672
https://www.sciencedirect.com/science/article/pii/S1309104225002740

Abstract

PM2.5 is emitted and formed in the atmosphere through various factors and poses significant health risks to humans. Accurately estimating PM2.5 concentrations and analyzing the contributions of individual factors are therefore crucial. A Deep Neural Network (DNN) model was developed for PM2.5 estimation in the Seoul Metropolitan Area, South Korea, and two other machine learning models, Random Forest and Extreme Gradient Boosting, were built for performance comparison. Among these, the DNN model performed best, with an R² of 0.95, MSE of 12.14, and MAE of 2.6. Explainable Artificial Intelligence (XAI) techniques, including Vanilla Gradient and Shapley Additive Explanation (SHAP), were then applied to interpret the PM2.5 estimation model and analyze the contribution of each factor. The contribution analysis for the Seoul Metropolitan Area revealed that NO₃⁻ and NH₄⁺ had the highest contributions to PM2.5 formation, indicating that secondary formation mechanisms play a dominant role. Furthermore, at high concentrations, the contributions of NO₃⁻, NH₄⁺, and SO₄²⁻ were the highest, and the contributions of metal components and PM10 were higher than average. In particular, NH₄⁺ and K showed a positive correlation with PM2.5 formation. Future research will focus on refining the model through clustering-based approaches and other enhancements, aiming to deepen the understanding of PM2.5 formation patterns and provide meaningful insights for policymaking.
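
The abstract names three evaluation metrics (R², MSE, MAE) and two attribution methods (Vanilla Gradient and SHAP) without giving their definitions. The following is a minimal NumPy sketch of those ideas, not the authors' implementation: the network is a tiny hand-built model with random weights, the three input features are hypothetical, and the all-zero baseline is an assumption chosen for illustration. It shows how the metrics are computed, how a vanilla gradient is the derivative of the model output with respect to each input, and how exact Shapley values can be enumerated when the feature set is small. Practical SHAP implementations typically approximate this sum rather than enumerate it.

```python
import itertools
import math
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return (R^2, MSE, MAE) for two 1-D arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred
    mse = float(np.mean(err ** 2))
    mae = float(np.mean(np.abs(err)))
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))
    r2 = 1.0 - float(np.sum(err ** 2)) / ss_tot
    return r2, mse, mae

# Hypothetical toy network (NOT the paper's DNN):
# 3 inputs -> 4 tanh hidden units -> 1 output, with random weights.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(3, 4))
b1 = rng.normal(size=4)
w2 = rng.normal(size=4)
b2 = 0.5

def predict(x):
    """Forward pass for a single input vector x of shape (3,)."""
    return float(np.tanh(x @ W1 + b1) @ w2 + b2)

def vanilla_gradient(x):
    """Gradient of the output with respect to the input, via the chain rule."""
    h = np.tanh(x @ W1 + b1)
    return W1 @ (w2 * (1.0 - h ** 2))

def shapley_values(x, baseline):
    """Exact Shapley values by enumerating every feature subset.

    Features outside a subset are replaced by their baseline value.
    """
    n = len(x)

    def value(subset):
        z = baseline.copy()
        idx = np.array(subset, dtype=int)
        z[idx] = x[idx]
        return predict(z)

    phi = np.zeros(n)
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for k in range(n):
            weight = math.factorial(k) * math.factorial(n - k - 1) / math.factorial(n)
            for s in itertools.combinations(others, k):
                phi[i] += weight * (value(s + (i,)) - value(s))
    return phi

x = np.array([0.2, -0.1, 0.4])   # hypothetical standardized feature values
baseline = np.zeros(3)           # assumed reference input
grad = vanilla_gradient(x)
phi = shapley_values(x, baseline)
```

The two attributions answer different questions: `grad` measures local sensitivity of the output to each input at `x`, while `phi` splits the difference between `predict(x)` and `predict(baseline)` across the features, so the entries of `phi` sum to that difference.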
Source journal
Atmospheric Pollution Research (subject category: Environmental Sciences)
CiteScore: 8.30
Self-citation rate: 6.70%
Articles published: 256
Review time: 36 days
Journal description: Atmospheric Pollution Research (APR) is an international journal designed for the publication of articles on air pollution. Papers should present novel experimental results, theory and modeling of air pollution on local, regional, or global scales. Areas covered are research on inorganic, organic, and persistent organic air pollutants, air quality monitoring, air quality management, atmospheric dispersion and transport, air-surface (soil, water, and vegetation) exchange of pollutants, dry and wet deposition, indoor air quality, exposure assessment, health effects, satellite measurements, natural emissions, atmospheric chemistry, greenhouse gases, and effects on climate change.