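Ground visibility prediction using tree-based and random-forest machine learning algorithm: Comparative study based on atmospheric pollution and atmospheric boundary layer data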

Impact factor 3.9 · Zone 3 (Environmental Science and Ecology) · JCR Q2 (Environmental Sciences)
Fuzeng Wang, Ruolan Liu, Hao Yan, Duanyang Liu, Lin Han, Shujie Yuan
Journal: Atmospheric Pollution Research, Volume 15, Issue 11, Article 102270
Publication date: 2024-07-29
Publication type: Journal Article
DOI: 10.1016/j.apr.2024.102270
URL: https://www.sciencedirect.com/science/article/pii/S1309104224002356
Citations: 0

Ground visibility prediction using tree-based and random-forest machine learning algorithm: Comparative study based on atmospheric pollution and atmospheric boundary layer data

To mitigate haze impacts, three visibility simulation schemes were designed using decision tree and random forest algorithms, drawing on atmospheric boundary layer meteorological data, pollutant concentrations, and ground observations. The optimal scheme was then used to investigate the boundary layer's effect on the simulations. The random forest algorithm simulated both haze processes better than the decision tree algorithm. In the first haze process, the random forest algorithm substantially reduced the root mean square error relative to the decision tree algorithm in the same visibility range (Scheme 3, visibility < 200 m: mean absolute error reduced by 5.9%, root mean square error reduced by 19.1%). Adding atmospheric boundary layer observation data significantly improved the accuracy of the visibility simulations for both fog-haze processes. However, adding atmospheric boundary layer meteorological data gave the greater improvement in the first haze process (random forest, visibility < 200 m: mean absolute errors of 25.0 m (relative error < 12.5%) and 25.5 m (relative error < 12.8%) in Schemes 2 and 3, respectively), whereas adding atmospheric boundary layer pollutant concentration data was more effective in the second haze process (random forest, visibility < 200 m: mean absolute errors of 25.6 m (relative error < 12.8%) and 11.1 m (relative error < 5.6%) in Schemes 2 and 3, respectively). How strongly the boundary layer meteorological and pollutant data influence the simulation of the two fog processes depends on the cause of each process.
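
The error metrics quoted in the abstract can be illustrated with a minimal Python sketch. The function names and the sample visibility values below are hypothetical and are not taken from the paper. The relative-error helper rests on an assumption: the quoted figures (for example 25.0 m and 12.5%) appear consistent with dividing the mean absolute error by the 200 m upper bound of the visibility range, but the abstract does not state this definition.

```python
import math


def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def root_mean_square_error(observed, predicted):
    """Square root of the average squared difference."""
    squared = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    return math.sqrt(squared / len(observed))


def percent_reduction(baseline_error, new_error):
    """Percentage by which new_error is lower than baseline_error."""
    return 100.0 * (baseline_error - new_error) / baseline_error


def relative_error_percent(mae_m, visibility_bound_m=200.0):
    """ASSUMPTION: relative error = MAE / upper bound of the visibility range."""
    return 100.0 * mae_m / visibility_bound_m


# Made-up visibility values in metres (NOT data from the paper).
observed = [50.0, 120.0, 180.0, 90.0]
tree_pred = [80.0, 100.0, 150.0, 130.0]     # stand-in for decision tree output
forest_pred = [60.0, 110.0, 170.0, 110.0]   # stand-in for random forest output

tree_mae = mean_absolute_error(observed, tree_pred)
forest_mae = mean_absolute_error(observed, forest_pred)
tree_rmse = root_mean_square_error(observed, tree_pred)
forest_rmse = root_mean_square_error(observed, forest_pred)

print("MAE reduction (%):", round(percent_reduction(tree_mae, forest_mae), 1))
print("RMSE reduction (%):", round(percent_reduction(tree_rmse, forest_rmse), 1))

# Check the assumed definition against MAE values quoted in the abstract.
for mae in (25.0, 25.5, 25.6, 11.1):
    print(mae, "m ->", round(relative_error_percent(mae), 2), "%")
```

Because the root mean square error squares each residual, it penalizes large misses more heavily than the mean absolute error, which is one reason the two metrics can improve by different percentages for the same pair of models.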

Source journal: Atmospheric Pollution Research (Environmental Sciences)
CiteScore: 8.30
Self-citation rate: 6.70%
Articles published: 256
Review time: 36 days
Journal description: Atmospheric Pollution Research (APR) is an international journal designed for the publication of articles on air pollution. Papers should present novel experimental results, theory and modeling of air pollution on local, regional, or global scales. Areas covered are research on inorganic, organic, and persistent organic air pollutants, air quality monitoring, air quality management, atmospheric dispersion and transport, air-surface (soil, water, and vegetation) exchange of pollutants, dry and wet deposition, indoor air quality, exposure assessment, health effects, satellite measurements, natural emissions, atmospheric chemistry, greenhouse gases, and effects on climate change.