Predicting Energy Generation in Large Wind Farms: A Data-Driven Study with Open Data and Machine Learning

Impact factor: 2.1, JCR Q2 (Engineering, Multidisciplinary)
Matheus Paula, Wallace Casaca, Marilaine Colnago, José R. da Silva, Kleber Oliveira, Mauricio A. Dias, Rogério Negri
Journal: Inventions. Publication date: 2023-10-11. Publication type: Journal Article.
DOI: 10.3390/inventions8050126 (https://doi.org/10.3390/inventions8050126)
Citation count: 0

Abstract

Wind energy has become a trend in Brazil, particularly in the country's northeastern region. Despite its advantages, wind power generation is hindered by the high volatility of exogenous factors such as weather, temperature, and air humidity, which makes long-term forecasting highly challenging. Reliable solutions are also needed, especially for large-scale wind farms, where forecasting involves integrating specific optimization tools and restricted-access datasets collected locally at the power plants. This paper therefore addresses the problem of forecasting the energy generated at the Praia Formosa wind farm, an eco-friendly park in the state of Ceará, Brazil, that produces around 7% of the state's electricity. For our data-driven analysis, publicly available data were collected from multiple official Brazilian sources and combined into a unified database for exploratory data analysis and predictive modeling. Three machine-learning approaches were applied: Extreme Gradient Boosting, Random Forest, and a Long Short-Term Memory (LSTM) network. Feature-engineering strategies, including the creation of artificial features and hyperparameter tuning, were used to enhance the precision of the models. All implemented models captured the trends, patterns, and seasonality of energy generation in the complex wind data. The LSTM-based model, however, consistently outperformed the others, achieving a global MAPE of 4.55%, which highlights its accuracy in long-term wind energy forecasting. Temperature, relative humidity, and wind speed were identified as the key factors influencing electricity production, with peak generation typically occurring from August to November.
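As a rough illustration of two ideas the abstract mentions, the sketch below computes a mean absolute percentage error (MAPE) and derives a few artificial features from a daily generation series. The abstract does not say which features the authors created or how their global MAPE was aggregated, so the function names `mape` and `add_artificial_features`, the cyclical month encoding, the one-day lag, and the toy numbers are all assumptions made for illustration, not the paper's method.

```python
from datetime import date
from math import sin, cos, pi


def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def add_artificial_features(dates, generation, lag=1):
    """Build one feature row per day, starting at index `lag`.

    Hypothetical features (not taken from the paper):
    - month_sin / month_cos: cyclical encoding of the calendar month,
      so that December and January end up close together
    - lag_generation: generation observed `lag` days earlier
    """
    rows = []
    for i in range(lag, len(dates)):
        month = dates[i].month
        rows.append({
            "date": dates[i],
            "month_sin": sin(2 * pi * month / 12),
            "month_cos": cos(2 * pi * month / 12),
            "lag_generation": generation[i - lag],
            "target": generation[i],
        })
    return rows


if __name__ == "__main__":
    # Toy numbers chosen for easy arithmetic; they are not data from the paper.
    actual = [100.0, 200.0, 400.0]
    predicted = [110.0, 190.0, 400.0]
    # Per-point errors are 10%, 5%, and 0%, so the mean is 5%.
    print(f"MAPE: {mape(actual, predicted):.2f}%")

    days = [date(2023, 8, 1), date(2023, 8, 2), date(2023, 8, 3)]
    for row in add_artificial_features(days, actual):
        print(row)
```

Rows produced this way could be fed to a tree-based regressor such as Random Forest or Extreme Gradient Boosting; an LSTM would instead need the series arranged into fixed-length input windows. Note that MAPE weights errors relative to the actual value, so it is undefined for days with zero generation.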
Source journal: Inventions (Engineering: Engineering (all))
CiteScore: 4.80
Self-citation rate: 11.80%
Articles published: 91
Review time: 12 weeks