Forecast of convective events via hybrid model: WRF and machine learning algorithms

Impact Factor 2.6; JCR Q2 (Computer Science, Interdisciplinary Applications)

Applied Computing and Geosciences, Volume 16, Article 100099. Pub Date: 2022-12-01. DOI: 10.1016/j.acags.2022.100099

Yasmin Uchôa da Silva, Gutemberg Borges França, Heloisa Musetti Ruivo, Haroldo Fraga de Campos Velho

Citations: 0

Abstract


Forecast of convective events via hybrid model: WRF and machine learning algorithms

This paper presents a novel hybrid 24-h forecasting model of convective weather events based on numerical simulation and machine learning algorithms. To characterize the convective events, 13 years (2008 to 2020) of precipitation data from the main airport stations in Rio de Janeiro, Brazil, and atmospheric discharges from the surrounding area of around 150 km were investigated. The Weather Research and Forecasting (WRF) model was used to numerically simulate atmospheric conditions for every day in February, the month with the greatest daily rate of atmospheric discharges in the data period. A p-value hypothesis test (with $α = 0.05$) was applied to each grid point of the numerically predicted variables in the 3-D WRF output, with each one defined as an independent attribute, to find those most associated with convective events. This test identified 36 attributes (or predictors), which were used as input to the training and testing of the machine learning algorithms in this study. Several cross-validation training and testing experiments were carried out using nine selected categorical machine learning algorithms and the 36 predictors. After the boosting technique was applied to the nine previously trained and tested algorithms, the 24-h predictions of convective occurrences were deemed satisfactory. The RandomForest method produced the best results, with near-perfect scores: POD = 1.00, FAR = 0.02, and CSI = 0.98. The 24-h hindcast with the nine algorithms for the 28 days of February 2019 was very encouraging: it almost reproduced the maturation phase of the events, and its occasional failures occurred during the formation and dissipation phases. The best and worst 24-h hindcasts had POD = 0.97 and 0.88, FAR = 0.02 and 0.12, and CSI = 0.94 and 0.78, respectively.
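The abstract reports its skill scores without defining them. The sketch below assumes the standard forecast-verification definitions for a yes/no event forecast: POD as probability of detection, FAR as false alarm ratio, and CSI as critical success index. The abstract does not confirm that these are the exact formulas the authors used. The `Contingency` class and the counts in `example` are hypothetical and were chosen only for illustration; they are not data from the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    """Counts from a 2x2 contingency table for a yes/no event forecast."""
    hits: int          # event forecast and observed
    misses: int        # event observed but not forecast
    false_alarms: int  # event forecast but not observed


def _ratio(numerator: int, denominator: int) -> float:
    # Return NaN instead of raising when a score is undefined (empty denominator).
    return numerator / denominator if denominator else float("nan")


def pod(c: Contingency) -> float:
    """Probability of detection: fraction of observed events that were forecast."""
    return _ratio(c.hits, c.hits + c.misses)


def far(c: Contingency) -> float:
    """False alarm ratio: fraction of forecast events that did not occur."""
    return _ratio(c.false_alarms, c.hits + c.false_alarms)


def csi(c: Contingency) -> float:
    """Critical success index: hits over all forecast or observed events."""
    return _ratio(c.hits, c.hits + c.misses + c.false_alarms)


# Hypothetical counts (an assumption, not taken from the paper).
example = Contingency(hits=98, misses=0, false_alarms=2)
print(f"POD={pod(example):.2f} FAR={far(example):.2f} CSI={csi(example):.2f}")
# prints: POD=1.00 FAR=0.02 CSI=0.98
```

Under these definitions, a forecast with no misses has CSI = 1 − FAR, which is consistent with the reported RandomForest scores (POD = 1.00, FAR = 0.02, CSI = 0.98).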


Source journal

Applied Computing and Geosciences (Computer Science: General Computer Science)

CiteScore: 5.50

Self-citation rate: 0.00%

Review time: 5 weeks