A Two-Stage Model for Outlier Detection and Power Prediction of Wind Turbine Using SCADA Dataset
Fatma Mazen Ali Mazen, Yomna Shaker, Rania Ahmed Abul Seoud
International Journal of Energy Research, vol. 2025, no. 1. Published 2025-09-02.
Impact factor: 4.3 (JCR Q2, Energy & Fuels)
DOI: 10.1155/er/2527561
https://onlinelibrary.wiley.com/doi/10.1155/er/2527561
Citations: 0
Abstract
As wind energy adoption grows, ensuring reliable power generation becomes critical. However, the inherent variability of wind power, caused by fluctuations in wind speed, direction, and environmental conditions, poses challenges for grid integration and operational planning. To mitigate these issues, researchers have developed various forecasting models and techniques for improving wind power prediction. Wind power forecasting enables more accurate power scheduling, reduces operational costs, and improves grid stability. Datasets derived from wind turbines' Supervisory Control and Data Acquisition (SCADA) systems offer high-resolution, real-time measurements of critical operational parameters, including wind speed and power output. Outliers in these datasets frequently stem from sensor faults, data transmission issues, extreme environmental conditions, or atypical turbine operations, which makes outlier detection vital for data integrity and effective maintenance planning. This paper proposes a two-stage model using the Kaggle SCADA dataset: an outlier detection stage followed by a power prediction stage. Outlier detection uses a One-Class support vector machine (SVM), which identifies 6588 anomalies without requiring labeled data. By exploiting the inherent physical relationship between wind speed and power output, engineered features such as cubic wind speed and residual power enhance the model's predictive accuracy. The model achieves a mean squared error (MSE) of 8809.20 kW² and a mean absolute error (MAE) of 63.35 kW, meaning that, on average, the predicted wind power output deviates from the actual output by 63.35 kW. The coefficient of determination (R²) of 0.9947 indicates that the model accounts for approximately 99.47% of the variance in the observed data.
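
The features and error metrics named in the abstract can be made concrete with a small sketch. It is illustrative only and is not the authors' implementation: the function names and the sample readings are hypothetical, and the definition of residual power (measured power minus the theoretical power-curve value) is an assumption, since the abstract does not define it. The sketch also shows why MSE carries squared units: taking its square root gives an error in kW, so the reported MSE of 8809.20 corresponds to a root mean squared error of about 93.9 kW.

```python
import math


def cubic_wind_speed(v_ms):
    """Cubic wind speed feature: power available in the wind scales with v**3."""
    return v_ms ** 3


def residual_power(actual_kw, theoretical_kw):
    """ASSUMED definition: measured power minus theoretical power-curve value."""
    return actual_kw - theoretical_kw


def mse(y_true, y_pred):
    """Mean squared error; for power in kW the unit is kW squared."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true, y_pred):
    """Mean absolute error; same unit as the target (kW)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical power readings and predictions in kW (not from the paper).
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 310.0, 390.0]

print("MSE  (kW^2):", mse(y_true, y_pred))
print("RMSE (kW):  ", math.sqrt(mse(y_true, y_pred)))
print("MAE  (kW):  ", mae(y_true, y_pred))
print("R^2:        ", r2(y_true, y_pred))
```

Because MSE squares each error, a few large deviations dominate it, whereas MAE weights all deviations equally; reporting both, as the abstract does, gives a fuller picture of the error distribution.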
About the journal:
The International Journal of Energy Research (IJER) is dedicated to providing a multidisciplinary, unique platform for researchers, scientists, engineers, technology developers, planners, and policy makers to present their research results and findings in a compelling manner on novel energy systems and applications. IJER covers the entire spectrum of energy from production to conversion, conservation, management, systems, technologies, etc. We encourage paper submissions aiming at better efficiency, cost improvements, more effective resource use, improved design and analysis, and reduced environmental impact, and hence at better sustainability.
IJER is concerned with the development and exploitation of both advanced traditional and new energy sources, systems, technologies and applications. Interdisciplinary subjects in the area of novel energy systems and applications are also encouraged. High-quality research papers are solicited in, but are not limited to, the following areas with innovative and novel contents:
-Biofuels and alternatives
-Carbon capturing and storage technologies
-Clean coal technologies
-Energy conversion, conservation and management
-Energy storage
-Energy systems
-Hybrid/combined/integrated energy systems for multi-generation
-Hydrogen energy and fuel cells
-Hydrogen production technologies
-Micro- and nano-energy systems and technologies
-Nuclear energy
-Renewable energies (e.g. geothermal, solar, wind, hydro, tidal, wave, biomass)
-Smart energy system