Fatma Mazen Ali Mazen, Yomna Shaker, Rania Ahmed Abul Seoud
International Journal of Energy Research, Volume 2025, Issue 1. Published 2025-09-02. Impact factor: 4.3 (JCR Q2, Energy & Fuels).
DOI: 10.1155/er/2527561
https://onlinelibrary.wiley.com/doi/10.1155/er/2527561
Citations: 0
A Two-Stage Model for Outlier Detection and Power Prediction of Wind Turbine Using SCADA Dataset
Abstract
As wind energy adoption grows, ensuring reliable power generation becomes critical. However, the inherent variability of wind power, caused by fluctuations in wind speed, direction, and environmental conditions, poses challenges for grid integration and operational planning. To mitigate these issues, researchers have developed various forecasting models and techniques for improving wind power prediction. Wind power forecasting enables more accurate power scheduling, reduces operational costs, and improves grid stability. Datasets derived from wind turbines’ Supervisory Control and Data Acquisition (SCADA) systems offer high-resolution, real-time measurements of critical operational parameters, including wind speed and power output. Outliers in these datasets frequently stem from sensor faults, data transmission issues, extreme environmental conditions, or atypical turbine operations, which makes outlier detection vital for data integrity and effective maintenance planning. This paper proposes a two-stage model using the Kaggle SCADA dataset: one stage for outlier detection followed by another stage for power prediction. A One-Class support vector machine (SVM) performs the outlier detection and identifies 6588 anomalies without requiring labeled data. Engineered features such as cubic wind speed and residual power exploit the physical relationship between wind speed and power output and enhance the model’s predictive accuracy. The proposed methodology achieves a mean squared error (MSE) of 8809.20 kW² and a mean absolute error (MAE) of 63.35 kW, meaning that, on average, the predicted wind power output deviates from the actual output by 63.35 kW. The coefficient of determination (R²) of 0.9947 indicates that the model accounts for approximately 99.47% of the variance in the observed data.
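The feature and metric definitions mentioned in the abstract can be sketched in a few lines of plain Python. This is a minimal illustration, not the authors' implementation. The function names are hypothetical, and the definition of residual power as measured power minus a reference (for example, theoretical power curve) value is an assumption, since the abstract does not define it. The numbers are toy values, not data from the paper.

```python
import math


def engineer_features(wind_speed, actual_power, reference_power):
    """Return (cubic wind speed, residual power) for each sample.

    Assumption: residual power = measured power - reference power,
    where the reference could be a theoretical power-curve value.
    """
    cubic = [v ** 3 for v in wind_speed]
    residual = [p - r for p, r in zip(actual_power, reference_power)]
    return cubic, residual


def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE and R^2 for paired observations."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)            # residual sum of squares
    mse = ss_res / n                               # units: kW^2 for power in kW
    mae = sum(abs(e) for e in errors) / n          # units: kW
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot                     # dimensionless
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae, "r2": r2}


# Toy example (kW); values are illustrative only.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 310.0, 390.0]
metrics = regression_metrics(actual, predicted)
print(metrics)
```

Because MSE averages squared errors, it carries squared units, and its square root (RMSE) is directly comparable to the MAE. Applied to the figures in the abstract, an MSE of 8809.20 corresponds to an RMSE of roughly 93.86 kW, alongside the reported MAE of 63.35 kW.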
About the journal:
The International Journal of Energy Research (IJER) is dedicated to providing a unique, multidisciplinary platform for researchers, scientists, engineers, technology developers, planners, and policy makers to present their research results and findings on novel energy systems and applications in a compelling manner. IJER covers the entire spectrum of energy, from production to conversion, conservation, management, systems, and technologies. We encourage paper submissions aiming at better efficiency, cost improvements, more effective resource use, improved design and analysis, and reduced environmental impact, and hence leading to better sustainability.
IJER is concerned with the development and exploitation of both advanced traditional and new energy sources, systems, technologies and applications. Interdisciplinary subjects in the area of novel energy systems and applications are also encouraged. High-quality research papers are solicited in, but are not limited to, the following areas with innovative and novel contents:
-Biofuels and alternatives
-Carbon capturing and storage technologies
-Clean coal technologies
-Energy conversion, conservation and management
-Energy storage
-Energy systems
-Hybrid/combined/integrated energy systems for multi-generation
-Hydrogen energy and fuel cells
-Hydrogen production technologies
-Micro- and nano-energy systems and technologies
-Nuclear energy
-Renewable energies (e.g. geothermal, solar, wind, hydro, tidal, wave, biomass)
-Smart energy system