An Ensemble Machine Learning Model to Enhance Extrapolation Ability of Predicting Coarse Particulate Matter with High Resolutions in China

IF 11.3 · Zone 1, Environmental Science and Ecology · JCR Q1, Engineering, Environmental
Su Shi, Renjie Chen, Peng Wang, Hongliang Zhang, Haidong Kan, and Xia Meng*
Environmental Science & Technology, vol. 58, no. 43, pp. 19325–19337. Published 2024-10-17. Journal Article.
DOI: 10.1021/acs.est.4c08610
https://pubs.acs.org/doi/10.1021/acs.est.4c08610

Abstract

Accurate exposure assessment is important for conducting PM10-2.5-related epidemiological studies, which have been limited thus far. In this study, we aimed to develop an ensemble machine learning method to estimate PM10-2.5 concentrations in mainland China during 2013–2020. The study was conducted in two stages. In the first stage, we developed two methods: the indirect method refers to developing models for PM2.5 and PM10 separately and subsequently calculating PM10-2.5 as the difference between them; and the direct method refers to establishing a model between PM10-2.5 measurements and relevant predictors directly. In the second stage, we employed an ensemble method by integrating predictions from both indirect and direct methods. Internal and external cross-validation (CV) were performed to validate the extrapolation capacity of models. The ensemble method demonstrated enhanced extrapolation accuracy in both internal and external CV compared to indirect and direct methods. The predictions produced by the ensemble method captured the spatiotemporal pattern of PM10-2.5, even in the sand and dust storm seasons. Our study introduces an ensemble strategy leveraging the strengths of both indirect and direct methods to estimate PM10-2.5 concentrations, which holds significant potential to support future epidemiological studies to address knowledge gaps in understanding the health effects of PM10-2.5.
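
The two-stage design described in the abstract can be sketched on synthetic data. This is only an illustration under stated assumptions: the abstract does not name the underlying learners or say how the two sets of predictions are integrated, so ordinary least squares stands in for the stage-one models and a stacked linear regression stands in for the ensemble. All data, predictors, and names (`predict_indirect`, `predict_direct`, `predict_ensemble`) are hypothetical.

```python
import numpy as np

def fit_ols(features, target):
    """Least-squares fit with an intercept; returns the coefficient vector."""
    design = np.column_stack([np.ones(len(features)), features])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

def predict_ols(coef, features):
    """Apply coefficients from fit_ols to new feature rows."""
    design = np.column_stack([np.ones(len(features)), features])
    return design @ coef

def rmse(pred, obs):
    return float(np.sqrt(np.mean((pred - obs) ** 2)))

# Synthetic stand-in data (NOT from the study): three hypothetical predictors.
rng = np.random.default_rng(0)
n = 600
X = rng.normal(size=(n, 3))
pm25 = np.exp(3.5 + 0.3 * X[:, 0] + 0.1 * X[:, 1] + rng.normal(scale=0.1, size=n))
coarse = np.exp(3.0 + 0.4 * X[:, 2] + 0.1 * X[:, 0] + rng.normal(scale=0.1, size=n))
pm10 = pm25 + coarse  # so the "measured" PM10-2.5 equals pm10 - pm25

# Three disjoint splits: stage-one training, ensemble fitting, hold-out check.
base, stack, hold = np.arange(0, 300), np.arange(300, 450), np.arange(450, n)

# Stage 1, indirect method: model PM2.5 and PM10 separately, then subtract.
coef_pm25 = fit_ols(X[base], pm25[base])
coef_pm10 = fit_ols(X[base], pm10[base])

def predict_indirect(features):
    return predict_ols(coef_pm10, features) - predict_ols(coef_pm25, features)

# Stage 1, direct method: model PM10-2.5 itself (on a log scale here, an
# assumption made so the two stage-one learners are not identical).
coef_direct = fit_ols(X[base], np.log(coarse[base]))

def predict_direct(features):
    return np.exp(predict_ols(coef_direct, features))

# Stage 2, ensemble: regress observed PM10-2.5 on both stage-one predictions.
def stage_one(features):
    return np.column_stack([predict_indirect(features), predict_direct(features)])

coef_ens = fit_ols(stage_one(X[stack]), coarse[stack])

def predict_ensemble(features):
    return predict_ols(coef_ens, stage_one(features))

for name, fn in [("indirect", predict_indirect),
                 ("direct", predict_direct),
                 ("ensemble", predict_ensemble)]:
    print(f"{name:9s} hold-out RMSE: {rmse(fn(X[hold]), coarse[hold]):.2f}")
```

The ensemble weights are fitted on a split that the stage-one models never saw, so the stacking step does not simply reward whichever base model overfits its training data. The hold-out comparison here is random rather than spatial; testing extrapolation in the sense of the abstract would require holding out whole monitoring sites or regions.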

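Source journal: Environmental Science & Technology (Environmental Sciences; Engineering, Environmental)
CiteScore: 17.50
Self-citation rate: 9.60%
Articles published: 12359
Review time: 2.8 months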