Machine learning-based prediction of ambient CO2 and CH4 concentrations with high temporal resolution in Seoul metropolitan area

IF 7.6; Zone 2, Environmental Science and Ecology; JCR Q1, Environmental Sciences
Seongjun Park, Kwang-Joo Moon, Hyo-Jin Eom, Seung-Muk Yi, Youngkwon Kim, Moonkyung Kim, Donghyun Rim, Young Su Lee
Journal: Environmental Pollution. Publication date: 2025-05-02. DOI: 10.1016/j.envpol.2025.126362 (https://doi.org/10.1016/j.envpol.2025.126362)

Abstract

Machine learning has the potential to support the growing need for high-resolution greenhouse gas monitoring in urban and industrial environments, where deploying extensive sensor networks is often limited by cost and operational challenges. This study presents a novel approach for estimating greenhouse gas (GHG) concentrations using routinely collected air quality and meteorological data from existing monitoring stations. Focusing on the Seoul metropolitan area, we developed and evaluated three machine learning models - Random Forest, Long Short-Term Memory (LSTM), and an ensemble learning approach - to predict CO2 and CH4 concentrations without relying on additional GHG monitoring equipment. Among these, the ensemble learning model outperformed the individual models, consistently achieving lower error metrics, even in data-limited scenarios. Feature importance analysis identifies NO2, CO, O3, and temperature as key predictors of CO2 and CH4 level variations, highlighting the influence of combustion-related pollutants and photochemical processes. Cross-validation results confirm the model’s out-of-sample capabilities; however, local factors, such as traffic density, industrial activities, and meteorology, can affect performance. Consequently, model retraining or transfer learning may be required when applying the model to new locations with comparable emission profiles or atmospheric conditions. These findings emphasize the importance of localized context in model application while also demonstrating the broader applicability of the approach. By utilizing data already available through urban monitoring networks, this study offers a scalable and cost-effective strategy to support high-resolution GHG monitoring and inform targeted climate policies in complex urban-industrial regions.
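
The ensemble idea summarized above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the NO2-to-CO2 relationship and its coefficients are invented, and the two base learners (a one-variable least-squares line and a k-nearest-neighbour average) are simple stand-ins for the Random Forest and LSTM models named in the abstract. The abstract does not state how the ensemble combines its members, so plain averaging is used here as one common choice.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def fit_linear(xs, ys):
    """Ordinary least squares with one predictor: y = a + b * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_knn(xs, ys, k=5):
    """k-nearest-neighbour regressor: mean target of the k closest training points."""
    pairs = list(zip(xs, ys))

    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)

    return predict


# Synthetic, hypothetical data: NO2 (ppb) as the only predictor of CO2 (ppm).
random.seed(0)
no2 = [random.uniform(5, 60) for _ in range(200)]
co2 = [420 + 0.8 * x + 5 * math.sin(x / 8) + random.gauss(0, 2) for x in no2]

# Hold out the last 20% of samples for out-of-sample evaluation.
split = 160
train_x, train_y = no2[:split], co2[:split]
test_x, test_y = no2[split:], co2[split:]

linear_model = fit_linear(train_x, train_y)
knn_model = fit_knn(train_x, train_y)

pred_linear = [linear_model(x) for x in test_x]
pred_knn = [knn_model(x) for x in test_x]
# Averaging ensemble: the mean of the two base predictions for each sample.
pred_ensemble = [(a + b) / 2 for a, b in zip(pred_linear, pred_knn)]

scores = {
    "linear": rmse(test_y, pred_linear),
    "knn": rmse(test_y, pred_knn),
    "ensemble": rmse(test_y, pred_ensemble),
}
for name, score in scores.items():
    print(f"{name}: RMSE = {score:.2f} ppm")
```

With plain averaging, the ensemble's squared error can never exceed the mean of its members' squared errors, so it is at least as good as the weaker member. Beating the stronger member, as the study reports for its ensemble, depends on the data and on how complementary the base models are.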


Source journal: Environmental Pollution (category: Environmental Sciences)
CiteScore: 16.00
Self-citation rate: 6.70%
Articles published: 2082
Review time: 2.9 months
Journal description: Environmental Pollution is an international peer-reviewed journal that publishes high-quality research papers and review articles covering all aspects of environmental pollution and its impacts on ecosystems and human health. Subject areas include, but are not limited to:
• Sources and occurrences of pollutants that are clearly defined and measured in environmental compartments, food and food-related items, and human bodies;
• Interlinks between contaminant exposure and biological, ecological, and human health effects, including those of climate change;
• Contaminants of emerging concern (including but not limited to antibiotic-resistant microorganisms or genes, microplastics/nanoplastics, electronic wastes, light, and noise) and/or their biological, ecological, or human health effects;
• Laboratory and field studies on the remediation/mitigation of environmental pollution via new techniques and with clear links to biological, ecological, or human health effects;
• Modeling of pollution processes, patterns, or trends that is of clear environmental and/or human health interest;
• New techniques that measure and examine environmental occurrences, transport, behavior, and effects of pollutants within the environment or the laboratory, provided that they can be clearly used to address problems within regional or global environmental compartments.