Performance of decision-tree-based ensemble classifiers in predicting fog frequency in ungauged areas

Impact factor: 3.0 | Region 3, Earth Sciences | JCR Q2, Meteorology & Atmospheric Sciences
Daeha Kim, Eunhee Kim, Eunji Kim
Journal: Weather and Forecasting
Published: 2023-10-09
DOI: 10.1175/waf-d-23-0024.1 (https://doi.org/10.1175/waf-d-23-0024.1)
Publication type: Journal Article
Citations: 0

Abstract

Fog is a phenomenon that exerts significant impacts on transportation, aviation, air quality, agriculture, and even water resources. While data-driven machine learning algorithms have shown promising performance in capturing non-linear fog events at point locations, their applicability to different areas and time periods is questionable. This study addresses this issue by examining five decision-tree-based classifiers in a South Korean region, where diverse fog formation mechanisms are at play. The five machine learning algorithms were trained at point locations, and tested with other point locations for time periods independent of the training processes. Using the ensemble classifiers and high-resolution atmospheric reanalysis data, we also attempted to establish fog occurrence maps in a regional area. Results showed that machine learning models trained on the local datasets exhibited superior performance in mountainous areas, where radiative cooling predominantly contributes to fog formation, compared to inland and coastal regions. As the fog generation mechanisms diversified, the tree-based ensemble models appeared to encounter challenges in delineating their decision boundaries. When they were trained with the reanalysis data, their predictive skills were significantly decreased, resulting in high false alarm rates. This prompted the need for post-processing techniques to rectify overestimated fog frequency. While post-processing may ameliorate overestimation, caution is needed to interpret the resultant fog frequency estimates, especially in regions with more diverse fog generation mechanisms. The spatial upscaling of machine-learning-based fog prediction models poses challenges owing to the intricate interplay of various fog formation mechanisms, data imbalances, and potential inaccuracies in reanalysis data.
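The abstract mentions high false alarm rates and overestimated fog frequency but does not give formulas or a post-processing recipe. As a minimal sketch, assuming binary fog/no-fog labels and classifier probabilities, the standard contingency-table scores and one simple frequency-matching threshold could be written in Python as below. The function names, the sample numbers, and the choice of frequency matching are illustrative assumptions, not the authors' method.

```python
from dataclasses import dataclass
from typing import Dict, Sequence


@dataclass
class Contingency:
    hits: int               # fog observed and predicted
    misses: int             # fog observed, not predicted
    false_alarms: int       # fog predicted, not observed
    correct_negatives: int  # no fog observed or predicted


def contingency(observed: Sequence[int], predicted: Sequence[int]) -> Contingency:
    """Build the 2x2 table from binary labels (1 = fog, 0 = no fog)."""
    pairs = list(zip(observed, predicted))
    return Contingency(
        hits=sum(1 for o, p in pairs if o == 1 and p == 1),
        misses=sum(1 for o, p in pairs if o == 1 and p == 0),
        false_alarms=sum(1 for o, p in pairs if o == 0 and p == 1),
        correct_negatives=sum(1 for o, p in pairs if o == 0 and p == 0),
    )


def _ratio(num: int, den: int) -> float:
    """Safe division that returns NaN when the denominator is zero."""
    return num / den if den else float("nan")


def scores(t: Contingency) -> Dict[str, float]:
    """Common categorical verification scores for a rare event such as fog."""
    return {
        "probability_of_detection": _ratio(t.hits, t.hits + t.misses),
        "false_alarm_ratio": _ratio(t.false_alarms, t.hits + t.false_alarms),
        "false_alarm_rate": _ratio(t.false_alarms, t.false_alarms + t.correct_negatives),
        # Values above 1 mean fog is predicted more often than it is observed.
        "frequency_bias": _ratio(t.hits + t.false_alarms, t.hits + t.misses),
    }


def frequency_matching_threshold(probabilities: Sequence[float],
                                 observed_frequency: float) -> float:
    """Pick a threshold so the predicted fog fraction matches a target frequency.

    Hypothetical post-processing step: the k highest probabilities are labelled
    fog, where k = round(observed_frequency * n). Ties at the threshold can
    label slightly more than k cases.
    """
    k = round(observed_frequency * len(probabilities))
    if k <= 0:
        return float("inf")  # no case is labelled fog
    ranked = sorted(probabilities, reverse=True)
    return ranked[min(k, len(ranked)) - 1]


if __name__ == "__main__":
    # Made-up sample: fog occurs in 2 of 10 cases (an imbalanced dataset).
    observed = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
    probs = [0.1, 0.6, 0.2, 0.7, 0.3, 0.55, 0.05, 0.9, 0.8, 0.4]

    raw = [int(p >= 0.5) for p in probs]
    print(scores(contingency(observed, raw)))

    thr = frequency_matching_threshold(probs, sum(observed) / len(observed))
    adjusted = [int(p >= thr) for p in probs]
    print(thr, scores(contingency(observed, adjusted)))
```

In this toy sample the default 0.5 threshold predicts fog far more often than it occurs, and raising the threshold removes the excess. In practice a threshold tuned at gauged stations may not transfer to ungauged areas, which is the caution the abstract raises about post-processed fog frequency estimates.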
Source journal: Weather and Forecasting (Earth Sciences: Meteorology & Atmospheric Sciences)
CiteScore: 5.20
Self-citation rate: 17.20%
Articles published: 131
Review time: 6-12 weeks
About the journal: Weather and Forecasting (WAF) (ISSN: 0882-8156; eISSN: 1520-0434) publishes research that is relevant to operational forecasting. This includes papers on significant weather events, forecasting techniques, forecast verification, model parameterizations, data assimilation, model ensembles, statistical postprocessing techniques, the transfer of research results to the forecasting community, and the societal use and value of forecasts. The scope of WAF includes research relevant to forecast lead times ranging from short-term “nowcasts” through seasonal time scales out to approximately two years.