A Framework for Evaluating Geomagnetic Indices Forecasting Models

IF 3.7, Region 2 (Earth Sciences)

Space Weather Pub Date : 2024-03-20 DOI:10.1029/2024sw003868

Armando Collado-Villaverde, Pablo Muñoz, Consuelo Cid


Citations: 0

Abstract

Deep Learning models are achieving strong results in forecasting geomagnetic storms. However, these models are evaluated mainly with generic regression metrics (such as the Root Mean Squared Error or the Coefficient of Determination), which cannot properly capture the particularities of geomagnetic storm forecasting. In particular, they provide little insight into performance during high-activity periods. To overcome this issue, we introduce the Binned Forecasting Error, which provides a more accurate assessment of a model's performance across the different intensity levels of a geomagnetic storm. This metric facilitates a robust comparison of different forecasting models, giving a faithful representation of a model's predictive capabilities while being resilient to differences in storm duration. To enable fair comparison among models, it is also important to standardize the sets of geomagnetic storms used for model training, validation, and testing. To do this, we start from the sets currently used in the literature for forecasting the SYM-H index and enrich them with newer storms not previously considered, covering not only disturbances caused by Coronal Mass Ejections but also those caused by High-Speed Streams. To operationalize the evaluation framework, we conduct a comparative study between a baseline neural network model and a persistence model, showcasing the effectiveness of the new metric in evaluating forecasting performance during intense geomagnetic storms. Finally, we propose using preliminary measurements from ACE to evaluate model performance in settings closer to the operational real-time scenario in which forecasting models are expected to run.
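The abstract does not give the formula for the Binned Forecasting Error, so the following is only a minimal sketch of one plausible reading: group samples by the intensity of the observed index, compute an RMSE inside each bin, and average the per-bin values so that long quiet periods do not dominate the score. The function names, the choice of RMSE within bins, the unweighted mean across bins, and the bin edges are all assumptions made for illustration, not the authors' definition. A persistence baseline (forecast equals the last observed value) is included for comparison.

```python
import numpy as np


def binned_forecasting_error(observed, predicted, bin_edges):
    """Average of per-bin RMSEs, with bins defined on the observed values.

    ASSUMPTION: this is an illustrative reading of a "binned" error, not the
    formula from the paper. Bins are half-open intervals [lo, hi); empty
    bins are skipped.
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    per_bin = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        mask = (observed >= lo) & (observed < hi)
        if mask.any():
            err = predicted[mask] - observed[mask]
            per_bin.append(float(np.sqrt(np.mean(err ** 2))))
    return float(np.mean(per_bin)), per_bin


def persistence_forecast(series, horizon):
    """Persistence baseline: the forecast for t + horizon is the value at t.

    Returns (forecast, target) as aligned arrays of equal length.
    """
    series = np.asarray(series, dtype=float)
    return series[:-horizon], series[horizon:]


# Toy data (hypothetical values): four quiet samples and one storm sample.
obs = np.array([0.0, 0.0, 0.0, 0.0, -100.0])
pred = np.array([0.0, 0.0, 0.0, 0.0, -50.0])

global_rmse = float(np.sqrt(np.mean((pred - obs) ** 2)))
score, per_bin = binned_forecasting_error(obs, pred, [-200.0, -50.0, 50.0])
print(global_rmse, score, per_bin)
```

In this toy case the global RMSE is about 22.4 because the four well-predicted quiet samples dilute the 50-unit miss on the storm sample, while the binned score is 25.0 with per-bin errors of 50.0 (storm bin) and 0.0 (quiet bin). Equal weighting of bins is a design choice in this sketch; it makes the score insensitive to how many quiet samples a storm interval happens to contain.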


Source journal

Space Weather

Self-citation rate

29.70%

Articles published

166