Integrating physics and data-driven approaches: An explainable and uncertainty-aware hybrid model for wind turbine power prediction

IF 7.2 · Zone 2, Physics and Astrophysics · JCR Q1, Computer Science, Interdisciplinary Applications
Alfonso Gijón, Simone Eiraudo, Antonio Manjavacas, Daniele Salvatore Schiera, Miguel Molina-Solana, Juan Gómez-Romero
Journal: Computer Physics Communications, vol. 316, Article 109761
Publication date: 2025-07-17
Publication type: Journal Article
DOI: 10.1016/j.cpc.2025.109761
URL: https://www.sciencedirect.com/science/article/pii/S0010465525002632
Open access: no
Citations: 0

Abstract

The rapid growth of the wind energy sector underscores the urgent need to optimize turbine operations and ensure effective maintenance through early fault detection systems. While traditional empirical and physics-based models offer approximate predictions of power generation based on wind speed, they often fail to capture the complex, non-linear relationships between other input variables and the resulting power output. Data-driven machine learning methods present a promising avenue for improving wind turbine modeling by leveraging large datasets, enhancing prediction accuracy but often at the cost of interpretability. In this study, we propose a hybrid semi-parametric model that combines the strengths of both approaches, applied to a dataset from a wind farm with four turbines. The model integrates a physics-inspired submodel, providing a reasonable approximation of power generation, with a non-parametric submodel that predicts the residuals. This non-parametric submodel is trained on a broader range of variables to account for phenomena not captured by the physics-based component. The hybrid model achieves a 37% improvement in prediction accuracy over the physics-based model and performs comparably to a purely data-driven reference model, while offering additional advantages in terms of explainability and robustness. To further enhance interpretability, SHAP values are used to analyze the influence of input features on the residual submodel's output. Additionally, prediction uncertainties are quantified using a conformalized quantile regression method. The combination of these techniques, alongside the physics grounding of the parametric submodel, provides a flexible, accurate, and reliable framework. Ultimately, this study opens the door for evaluating the impact of unmodeled phenomena on wind turbine power generation, offering a basis for potential optimization.
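The residual structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the submodels, so the sketch assumes a capped cubic power law for the physics-inspired part and a k-nearest-neighbour average for the non-parametric residual part. For the uncertainty step it uses plain split conformal prediction, a simpler relative of the conformalized quantile regression named in the abstract. All constants, function names, and the synthetic data (including the temperature effect) are hypothetical.

```python
import math
import random

# All constants are hypothetical illustration values, not taken from the paper.
RHO = 1.225          # air density, kg/m^3
ROTOR_AREA = 5000.0  # swept rotor area, m^2
CP = 0.4             # power coefficient
RATED_KW = 2000.0    # rated power, kW


def physics_power(v):
    """Physics-inspired submodel: cubic law in wind speed, capped at rated power."""
    return min(0.5 * RHO * ROTOR_AREA * CP * v ** 3 / 1000.0, RATED_KW)


def fit_residual_knn(X, residuals, k=5):
    """Non-parametric residual submodel: average residual of the k nearest points."""
    def predict(x):
        nearest = sorted(range(len(X)), key=lambda i: math.dist(X[i], x))[:k]
        return sum(residuals[i] for i in nearest) / len(nearest)
    return predict


def conformal_halfwidth(abs_errors, alpha=0.1):
    """Split conformal: the ceil((n+1)(1-alpha))-th smallest calibration error.

    If that rank exceeds n, the largest error is used as an approximation.
    """
    n = len(abs_errors)
    rank = math.ceil((n + 1) * (1 - alpha))
    return sorted(abs_errors)[min(rank, n) - 1]


def make_data(n, rng):
    """Synthetic samples: features (wind speed, temperature) and measured power."""
    X, y = [], []
    for _ in range(n):
        v = rng.uniform(3.0, 12.0)
        t = rng.uniform(0.0, 30.0)
        # Assumed unmodeled effect: output falls as temperature rises.
        p = physics_power(v) * (1.0 - 0.004 * (t - 15.0)) + rng.gauss(0.0, 10.0)
        X.append((v, t))
        y.append(p)
    return X, y


def mean_abs_error(predict, X, y):
    return sum(abs(yi - predict(xi)) for xi, yi in zip(X, y)) / len(y)


rng = random.Random(0)
X_train, y_train = make_data(400, rng)
X_cal, y_cal = make_data(200, rng)
X_test, y_test = make_data(200, rng)

# Train the residual submodel on what the physics submodel fails to explain.
train_residuals = [yi - physics_power(xi[0]) for xi, yi in zip(X_train, y_train)]
residual_model = fit_residual_knn(X_train, train_residuals)


def hybrid_power(x):
    """Hybrid prediction: physics estimate plus learned residual correction."""
    return physics_power(x[0]) + residual_model(x)


# Calibrate a symmetric interval half-width on held-out data.
q = conformal_halfwidth(
    [abs(yi - hybrid_power(xi)) for xi, yi in zip(X_cal, y_cal)], alpha=0.1
)

mae_physics = mean_abs_error(lambda x: physics_power(x[0]), X_test, y_test)
mae_hybrid = mean_abs_error(hybrid_power, X_test, y_test)
coverage = sum(
    abs(yi - hybrid_power(xi)) <= q for xi, yi in zip(X_test, y_test)
) / len(y_test)

print(f"MAE physics: {mae_physics:.1f} kW, MAE hybrid: {mae_hybrid:.1f} kW")
print(f"Interval half-width: {q:.1f} kW, test coverage: {coverage:.2f}")
```

Because the residual submodel only has to learn the part the physics submodel misses, its output can be inspected separately (for example with SHAP values, as the abstract describes), while the physics term keeps the prediction anchored to a known power curve.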
Source journal
Computer Physics Communications (Physics; Computer Science, Interdisciplinary Applications)
CiteScore: 12.10
Self-citation rate: 3.20%
Articles published: 287
Review time: 5.3 months
Journal description: The focus of CPC is on contemporary computational methods and techniques and their implementation, the effectiveness of which will normally be evidenced by the author(s) within the context of a substantive problem in physics. Within this setting CPC publishes two types of paper.
Computer Programs in Physics (CPiP): These papers describe significant computer programs to be archived in the CPC Program Library, which is held in the Mendeley Data repository. The submitted software must be covered by an approved open source licence. Papers and associated computer programs that address a problem of contemporary interest in physics that cannot be solved by current software are particularly encouraged.
Computational Physics Papers (CP): These are research papers in, but not limited to, the following themes across computational physics and related disciplines: mathematical and numerical methods and algorithms; computational models, including those associated with the design, control and analysis of experiments; and algebraic computation. Each will normally include software implementation and performance details. The software implementation should, ideally, be available via GitHub, Zenodo or an institutional repository.
In addition, research papers on the impact of advanced computer architecture and special purpose computers on computing in the physical sciences, and on software topics related to, and of importance in, the physical sciences, may be considered.