Continuous Distributions and Measures of Statistical Accuracy for Structured Expert Judgment

Guus Rongen, Gabriela F. Nane, Oswaldo Morales-Napoles, Roger M. Cooke
Futures & Foresight Science, Volume 7, Issue 2. Published 2025-05-05. DOI: 10.1002/ffo2.70009
https://onlinelibrary.wiley.com/doi/10.1002/ffo2.70009

Abstract

This study evaluates five scoring rules, or measures of statistical accuracy, for assessing uncertainty estimates from expert judgment studies and model forecasts. These rules, namely the Continuously Ranked Probability Score (CRPS), the Kolmogorov-Smirnov (KS), Cramér-von Mises (CvM), and Anderson-Darling (AD) statistics, and the chi-square test, were applied to 6864 expert uncertainty estimates from 49 Classical Model (CM) studies. We compared their sensitivity to various biases and their ability to serve as performance-based weights for expert estimates. Additionally, the piecewise uniform and Metalog distributions were evaluated for how well they represent expert estimates, because four of the five rules require interpolating the experts' estimates. Simulating biased estimates reveals that the considered test statistics differ in their sensitivity to these biases. Expert weights derived using one measure of statistical accuracy were evaluated with the other measures to assess their performance. The main conclusions are: (1) CRPS overlooks important biases, while chi-square and AD behave similarly, as do KS and CvM. (2) All measures except CRPS agree that performance weighting is superior to equal weighting with respect to statistical accuracy. (3) Neither distribution can effectively predict the position of a removed quantile estimate. These insights show how different scoring rules behave when combining uncertainty estimates from experts or models, and extend the knowledge base for best practices.
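
The interpolation step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each expert gives 5%, 50%, and 95% quantiles per item, and that the support bounds `lower` and `upper` are chosen by the analyst; both are assumptions of this example, as are all function names. The realization of each item is mapped through the expert's piecewise uniform CDF, and the resulting values are compared with a Uniform(0, 1) distribution using the standard KS and CvM statistics.

```python
from bisect import bisect_right


def piecewise_uniform_cdf(x, quantile_values, quantile_levels, lower, upper):
    """CDF at x of a piecewise uniform distribution through the given quantiles.

    `lower` and `upper` close the support; how to pick them is an
    assumption of this sketch, not something the abstract specifies.
    """
    xs = [lower, *quantile_values, upper]   # must be strictly increasing
    ps = [0.0, *quantile_levels, 1.0]
    if x <= xs[0]:
        return 0.0
    if x >= xs[-1]:
        return 1.0
    i = bisect_right(xs, x) - 1             # segment with xs[i] <= x < xs[i + 1]
    frac = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ps[i] + frac * (ps[i + 1] - ps[i])  # linear CDF inside the segment


def ks_statistic(u_values):
    """Kolmogorov-Smirnov distance between a sample and Uniform(0, 1)."""
    u = sorted(u_values)
    n = len(u)
    d_plus = max((i + 1) / n - v for i, v in enumerate(u))
    d_minus = max(v - i / n for i, v in enumerate(u))
    return max(d_plus, d_minus)


def cvm_statistic(u_values):
    """Cramér-von Mises statistic of a sample against Uniform(0, 1)."""
    u = sorted(u_values)
    n = len(u)
    return 1 / (12 * n) + sum(
        (v - (2 * i + 1) / (2 * n)) ** 2 for i, v in enumerate(u)
    )


# Hypothetical data: (5%, 50%, 95%) quantiles, realization, (lower, upper).
levels = (0.05, 0.50, 0.95)
items = [
    ((10.0, 20.0, 40.0), 25.0, (0.0, 50.0)),
    ((1.0, 3.0, 9.0), 2.0, (0.0, 12.0)),
    ((100.0, 150.0, 300.0), 290.0, (50.0, 400.0)),
]
u_values = [
    piecewise_uniform_cdf(real, q, levels, lo, hi)
    for q, real, (lo, hi) in items
]
print("CDF values at realizations:", u_values)
print("KS :", ks_statistic(u_values))
print("CvM:", cvm_statistic(u_values))
```

For a statistically accurate expert the CDF values at the realizations should look like a uniform sample, so smaller KS and CvM values indicate better agreement. Turning these statistics into p-values or weights is a separate step that this sketch does not cover.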

