Advanced power curve modeling for wind turbines: A multivariable approach with SGBRT and grey wolf optimization
Wenliang Yin, Mengqian Jia, Lin Liu, Ming Li, Youguang Guo, Gang Lei, Jian Guo Zhu
Energy Conversion and Management, vol. 332, Article 119680. Published 2025-03-20. DOI: 10.1016/j.enconman.2025.119680
https://www.sciencedirect.com/science/article/pii/S0196890425002031
Abstract
Accurate power curve modeling is crucial for improving the operational efficiency and performance of grid-connected wind turbines (WTs). To enhance modeling quality and eliminate interactions among input variables, this paper proposes a novel multivariable power curve prediction approach that integrates advanced machine learning techniques, namely stochastic gradient boosting regression tree (SGBRT) and grey wolf optimization (GWO), with innovative data preprocessing and feature selection methods. The specific contributions are as follows. 1) The raw data are cleaned in a two-dimensional Copula space, using wind wheel speed as an auxiliary criterion together with a probabilistic description, to handle data uncertainties and nonlinear dependencies. 2) A partial mutual information (PMI) method is presented for analyzing data characteristics, based on which eight significant parameters are selected as modeling input variables, reducing computational complexity while enhancing prediction accuracy. 3) A power curve prediction model with multiple input variables is established using SGBRT, and its hyperparameters are optimized by a GWO algorithm guided by a fitness function that combines root mean square error (RMSE), mean absolute error (MAE) and the coefficient of determination (R²). 4) Validated with real SCADA data from WTs in service, the proposed model achieves the smallest standardized residuals (6.56%), RMSE (around 27 kW) and MAE (19.27 kW), and a superior average R² (98.61%) across all speed regions. Comparative studies indicate that the proposed approach outperforms existing methods, offering significant improvements in accuracy, efficiency, robustness and adaptability for WT power curve modeling.
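The abstract's third contribution, a GWO search guided by a fitness function that combines RMSE, MAE and R², can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: the equal weights, the use of (1 - R²) as the R² term, the population size, and the iteration count. To keep the sketch dependency-free, a toy two-parameter logistic curve stands in for the SGBRT model; in practice the objective would train a stochastic gradient boosting regressor (for example scikit-learn's GradientBoostingRegressor with subsample below 1.0) on the candidate hyperparameters and score its validation predictions.

```python
import math
import random


def combined_fitness(y_true, y_pred, weights=(1.0, 1.0, 1.0)):
    """Lower is better: w1*RMSE + w2*MAE + w3*(1 - R^2). Weights are assumed."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else 0.0
    w_rmse, w_mae, w_r2 = weights
    return w_rmse * rmse + w_mae * mae + w_r2 * (1.0 - r2)


def grey_wolf_optimize(objective, bounds, n_wolves=12, n_iter=60, seed=0):
    """Basic grey wolf optimizer minimizing `objective` inside box `bounds`."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best_pos, best_score = None, float("inf")
    for it in range(n_iter + 1):
        scored = sorted(((objective(w), w) for w in wolves), key=lambda s: s[0])
        if scored[0][0] < best_score:
            best_score, best_pos = scored[0][0], list(scored[0][1])
        if it == n_iter:  # last pass only evaluates the final population
            break
        leaders = [list(s[1]) for s in scored[:3]]  # alpha, beta, delta wolves
        a = 2.0 * (1.0 - it / n_iter)  # control parameter decays from 2 toward 0
        new_wolves = []
        for w in wolves:
            pos = []
            for d, (lo, hi) in enumerate(bounds):
                steps = []
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    dist = abs(C * leader[d] - w[d])
                    steps.append(leader[d] - A * dist)
                # Average the three leader-guided moves, then clip to bounds.
                pos.append(min(max(sum(steps) / 3.0, lo), hi))
            new_wolves.append(pos)
        wolves = new_wolves
    return best_pos, best_score


# Toy stand-in for the SGBRT model (hypothetical): a logistic power curve.
def logistic_curve(v, steepness, midpoint, rated_kw=2000.0):
    return rated_kw / (1.0 + math.exp(-steepness * (v - midpoint)))


wind_speeds = [3.0 + 0.5 * i for i in range(25)]  # 3 to 15 m/s
observed_kw = [logistic_curve(v, 0.9, 9.0) for v in wind_speeds]


def objective(params):
    steepness, midpoint = params
    predicted = [logistic_curve(v, steepness, midpoint) for v in wind_speeds]
    return combined_fitness(observed_kw, predicted)


bounds = [(0.1, 2.0), (5.0, 13.0)]
best_params, best_score = grey_wolf_optimize(objective, bounds)
print("best parameters:", best_params, "fitness:", best_score)
```

Because RMSE and MAE carry units of kW while (1 - R²) is dimensionless, the weights in a real setting would likely need normalizing so that no single term dominates the search.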
About the journal:
The journal Energy Conversion and Management provides a forum for publishing original contributions and comprehensive technical review articles of interdisciplinary and original research on all important energy topics.
The topics considered include energy generation, utilization, conversion, storage, transmission, conservation, management and sustainability. These topics typically involve various types of energy such as mechanical, thermal, nuclear, chemical, electromagnetic, magnetic and electric. These energy types cover all known energy resources, including renewable resources (e.g., solar, bio, hydro, wind, geothermal and ocean energy), fossil fuels and nuclear resources.