A meta-learning framework with temporal feature integration for electricity load forecasting

JCR: Q2 (Energy)

Energy Informatics, Pub Date: 2025-08-28, DOI: 10.1186/s42162-025-00572-y

Rakesh Salakapuri, Thirukkavalluru Pavankumar


Citations: 0

Abstract



Accurate electricity load forecasting is essential for the stability, efficiency, and sustainability of modern power systems. However, individual forecasting models often lack generalization across temporal and regional variations and offer limited interpretability. This study proposes a comprehensive meta-learning-based forecast combination framework to enhance both prediction accuracy and model transparency. Using hourly load data from 20 European countries spanning 2018 to 2024, the framework incorporates time-aware features such as hour of the day, day of the week, month, and public holidays. Ten diverse base models—including XGBoost, LightGBM, Random Forest, and LSTM—are trained globally, from which the top five performers are selected (based on R², MAE, and MAPE) and fed into five meta-learners: Ridge Regression, Lasso, Random Forest, Gradient Boosting, and MLP. These meta-models are trained using both model predictions and engineered time features. Experimental results demonstrate superior performance, with the best-performing meta-learner (Random Forest Regressor) achieving a coefficient of determination (R²) of 0.9998 and a Mean Absolute Percentage Error (MAPE) of 0.79%, significantly outperforming traditional ensemble methods. Furthermore, the inclusion of lag features and 5-fold cross-validation led to substantial improvements across all models, including dramatic reductions in MAE (up to 87%), MAPE (up to 88%), and MSE (up to 97%), along with near-perfect R² scores (~ 1.000). Additionally, SHAP-based explainability reveals the contribution of individual time-based features and the influence of each base model within the ensemble, thereby enhancing transparency and supporting practical decision-making.
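The abstract names three ingredients that are easy to picture in code: calendar features (hour, day of week, month, public holiday), lag features, and the MAE, MAPE, and R² metrics. The following is a minimal sketch of those pieces in plain Python, not the authors' implementation. All function names are hypothetical, the lag offsets of 1 and 24 hours are an assumption (the abstract does not say which lags were used), and `meta_row` only illustrates how a meta-learner input could combine base-model predictions with time features.

```python
from datetime import date, datetime


def time_features(ts, holidays=frozenset()):
    """Calendar features listed in the abstract for one hourly timestamp."""
    return {
        "hour": ts.hour,
        "day_of_week": ts.weekday(),  # Monday = 0, Sunday = 6
        "month": ts.month,
        "is_holiday": int(ts.date() in holidays),
    }


def add_lags(loads, lags=(1, 24)):
    """Pair each index t that has enough history with its lagged load values.

    The lag offsets are an assumption made for illustration.
    """
    start = max(lags)
    return [(t, [loads[t - k] for k in lags]) for t in range(start, len(loads))]


def meta_row(base_preds, ts, holidays=frozenset()):
    """One meta-learner input row: base-model predictions plus time features."""
    f = time_features(ts, holidays)
    return list(base_preds) + [f["hour"], f["day_of_week"], f["month"], f["is_holiday"]]


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (y_true must be non-zero)."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination (y_true must not be constant)."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot
```

In a full pipeline, rows produced by something like `meta_row` would be fed to one of the meta-learners the abstract lists (for example a random forest regressor), and the metric functions would be evaluated on held-out folds.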


Source journal: Energy Informatics (Computer Science: Computer Networks and Communications)

CiteScore: 5.50

Self-citation rate: 0.00%

Review time: 5 weeks