Forecast score distributions with imperfect observations

JCR quartile: Q1 (Mathematics)
J. Bessac, P. Naveau
DOI: 10.5194/ascmo-7-53-2021 (https://doi.org/10.5194/ascmo-7-53-2021)
Journal: Advances in Statistical Climatology, Meteorology and Oceanography
Publication type: Journal Article
Publication date (as listed): 2018-06-10
Citations: 7

Abstract

Statistics has become one of the mathematical foundations of forecast evaluation, especially with regard to computing scoring rules. The classical paradigm of scoring rules is to discriminate between two different forecasts by comparing them with observations. The probability distribution of the observed record is assumed to be perfect as a verification benchmark. In practice, however, observations are almost always tainted by errors and uncertainties. These may be due to homogenization problems, instrumental deficiencies, the need for indirect reconstructions from other sources (e.g., radar data), model errors in gridded products such as reanalyses, or other data-recording issues. If the yardstick used to compare forecasts is imprecise, one can ask whether such errors have a strong influence on decisions based on classical scoring rules.

We propose a new scoring rule scheme for models that incorporate errors in the verification data. We rely on existing scoring rules and incorporate the uncertainty and error of the verification data through a hidden variable and the conditional expectation of scores when they are viewed as random variables. The proposed scoring framework is applied to standard setups, mainly an additive Gaussian noise model and a multiplicative Gamma noise model. These classical examples provide known and tractable conditional distributions and, consequently, allow us to interpret explicit expressions of our score. By considering scores to be random variables, one can access the entire range of their distribution. In particular, we illustrate that the commonly used mean score can be a misleading representative of the distribution when the latter is highly skewed or has heavy tails.

In a simulation study, we use the power of a statistical test to demonstrate that the newly proposed score discriminates between forecasts better than the scores used in practice when verification data are subject to uncertainty. We illustrate the benefit of accounting for the uncertainty of the verification data in the scoring procedure on a dataset of surface wind speed from measurements and numerical model outputs. Finally, we open a discussion on the use of the proposed scoring framework for non-explicit conditional distributions.
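The idea of scoring against a hidden truth through a conditional expectation can be illustrated with a minimal sketch. It is not the paper's own derivation. It assumes an additive Gaussian noise model in which the hidden truth is X ~ N(mu0, sigma0^2) and the observation is Y = X + eps with eps ~ N(0, omega^2), and it uses the squared error of a point forecast m as a stand-in score because its conditional expectation has a simple closed form. All function and parameter names here are hypothetical.

```python
import numpy as np


def conditional_moments(y, mu0, sigma0, omega):
    """Mean and variance of the hidden truth X given the noisy observation Y = y.

    Assumed model: X ~ N(mu0, sigma0^2), Y = X + eps, eps ~ N(0, omega^2),
    with eps independent of X.
    """
    weight = sigma0**2 / (sigma0**2 + omega**2)
    mean = mu0 + weight * (y - mu0)
    var = sigma0**2 * omega**2 / (sigma0**2 + omega**2)
    return mean, var


def naive_score(m, y):
    """Squared error computed directly against the noisy observation."""
    return (m - y) ** 2


def corrected_score(m, y, mu0, sigma0, omega):
    """Conditional expectation E[(m - X)^2 | Y = y] under the assumed model."""
    mean, var = conditional_moments(y, mu0, sigma0, omega)
    return (m - mean) ** 2 + var


def simulate_mean_scores(m, mu0, sigma0, omega, n=200_000, seed=0):
    """Monte Carlo means of the ideal, naive, and corrected scores."""
    rng = np.random.default_rng(seed)
    x = rng.normal(mu0, sigma0, size=n)        # hidden truth
    y = x + rng.normal(0.0, omega, size=n)     # noisy verification data
    ideal = np.mean((m - x) ** 2)              # unavailable in practice
    naive = np.mean(naive_score(m, y))
    corrected = np.mean(corrected_score(m, y, mu0, sigma0, omega))
    return ideal, naive, corrected


if __name__ == "__main__":
    ideal, naive, corrected = simulate_mean_scores(
        m=0.3, mu0=0.0, sigma0=1.0, omega=0.5
    )
    print(f"ideal={ideal:.3f} naive={naive:.3f} corrected={corrected:.3f}")
```

Under these assumptions, the mean naive score exceeds the mean ideal score by omega^2, whereas the mean corrected score matches the ideal one by the tower property of conditional expectation. The sketch requires mu0, sigma0, and omega to be known, which is itself a modelling assumption.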
Source journal
Advances in Statistical Climatology, Meteorology and Oceanography (Earth and Planetary Sciences: Atmospheric Science)
CiteScore: 4.80
Self-citation rate: 0.00%
Articles published: 9
Review time: 26 weeks