$\mathrm{NICE}^k$ metrics: Unified and multidimensional framework for evaluating deterministic solar forecasting accuracy
Cyril Voyant, Milan Despotovic, Luis Garcia-Gutierrez, Rodrigo Amaro e Silva, Philippe Lauret, Ted Soubdhan, Nadjem Bailek
Sustainable Energy Technologies and Assessments, volume 83, Article 104588. Published 2025-09-24.
DOI: 10.1016/j.seta.2025.104588
URL: https://www.sciencedirect.com/science/article/pii/S2213138825004199
Abstract
Accurate solar energy output prediction is key to grid stability and efficient energy management. However, conventional error metrics (such as Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), coefficient of determination ($R^2$) and Skill Scores (SS)) fail to capture the multidimensional complexity of solar irradiance forecasting. They lack sensitivity to forecastability, depend on arbitrary baselines (e.g., clear-sky models) or adapt poorly to operational needs. To address these limitations, this study introduces $\mathrm{NICE}^k$ (Normalized Informed Comparison of Errors, $k = 1, 2, 3, \Sigma$), a robust, flexible, and multidimensional evaluation framework. Each score is tied to an $L^k$ norm: $\mathrm{NICE}^1$ targets average errors, $\mathrm{NICE}^2$ emphasizes large deviations, $\mathrm{NICE}^3$ amplifies outliers, and $\mathrm{NICE}^\Sigma$ aggregates all contributions. Validation relied on synthetic Monte Carlo trials and real data from Spain's SIAR network (68 stations across diverse climates). Benchmark models include autoregressive methods, Extreme Learning, and smart persistence, covering both linear and machine learning strategies. Theoretical NICE metrics matched empirical values only under strict assumptions ($R^2 \sim 1.0$ for $\mathrm{NICE}^2$). In contrast, $\mathrm{NICE}^\Sigma$ consistently outperformed conventional metrics in discriminative power ($p < 0.05$ vs $p > 0.05$ for nRMSE/nMAE). Over longer horizons, $\mathrm{NICE}^\Sigma$ maintained significant p-values ($10^{-6}$ to 0.004), unlike nRMSE and nMAE. In addition, conventional metrics (including nMBE and $R^2$) failed to distinguish models in pairwise tests, whereas $\mathrm{NICE}^k$ achieved $p < 0.001$ with wider, normally distributed values, enhancing comparability. Theoretical and empirical findings confirm the framework's sensitivity and operational relevance. These results support adopting $\mathrm{NICE}^k$ as a unified, interpretable, and robust alternative to conventional metrics for deterministic solar forecasting.
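The abstract links each score to an L^k norm but does not give the NICE formulas, so the following is only a minimal sketch of the underlying idea, not the authors' metric. It computes a generic order-k mean error, (mean(|e|^k))^(1/k), for k = 1, 2, 3 on two made-up forecasts with the same mean absolute error. The function name `lk_error` and all irradiance values are hypothetical, chosen purely for illustration.

```python
def lk_error(observed, forecast, k):
    """Order-k mean error: (mean(|forecast - observed|^k))^(1/k).

    k=1 gives the MAE, k=2 the RMSE, k=3 a cubic analogue.
    This is a generic L^k-type error, NOT the NICE^k definition.
    """
    if not observed or len(observed) != len(forecast):
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum(abs(f - o) ** k for o, f in zip(observed, forecast))
    return (total / len(observed)) ** (1.0 / k)


# Hypothetical irradiance observations (W/m^2) and two forecasts.
observed = [500.0, 520.0, 480.0, 510.0, 490.0]
forecast_even = [510.0, 510.0, 490.0, 500.0, 500.0]   # every error is 10
forecast_spiky = [500.0, 520.0, 480.0, 510.0, 540.0]  # a single error of 50

errors_even = {k: lk_error(observed, forecast_even, k) for k in (1, 2, 3)}
errors_spiky = {k: lk_error(observed, forecast_spiky, k) for k in (1, 2, 3)}

for k in (1, 2, 3):
    print(f"k={k}: even={errors_even[k]:.2f}  spiky={errors_spiky[k]:.2f}")
```

Both forecasts have the same order-1 error, yet the order-2 and order-3 errors of the forecast with one large miss grow progressively larger. This is the sense in which higher-order norms emphasize large deviations and outliers.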
About the journal:
Encouraging a transition to a sustainable energy future is imperative for our world. Technologies that enable this shift in various sectors like transportation, heating, and power systems are of utmost importance. Sustainable Energy Technologies and Assessments welcomes papers focusing on a range of aspects and levels of technological advancements in energy generation and utilization. The aim is to reduce the negative environmental impact associated with energy production and consumption, spanning from laboratory experiments to real-world applications in the commercial sector.