A Hybrid machine learning–statistical based method for short-term energy consumption prediction in residential buildings

Impact factor: 9.6 · JCR Q1 (Computer Science, Artificial Intelligence)
Kamran Hassanpouri Baesmat, Emma E. Regentova, Yahia Baghzouz
Journal: Energy and AI, volume 21, Article 100552
Published: 2025-07-09
DOI: 10.1016/j.egyai.2025.100552
URL: https://www.sciencedirect.com/science/article/pii/S2666546825000849
Citations: 0

Abstract

Accurate short-term load forecasting is essential for modern power systems: it enables efficient energy management and supports grid reliability amid rising demand and variable weather. This study addresses the challenge of forecasting household electricity consumption by proposing SSRXLR, a novel hybrid method that integrates statistical and machine learning techniques. It combines a sparse Seasonal Autoregressive Integrated Moving Average Exogenous (SARIMAX) model, Random Forest, Extreme Gradient Boosting, Long Short-Term Memory, and a Residual Correction step to capture both linear trends and complex nonlinear relationships. We analyzed one year of high-resolution (5-minute interval) energy and weather data from a household in Las Vegas, Nevada. Through a rigorous feature selection process, we identified the four most influential features: sea level pressure, temperature, feels-like temperature, and dew point. The proposed method demonstrated strong prediction performance across multiple metrics. It achieved a root mean square logarithmic error (RMSLE) of 0.043, which is 0.066 lower than that of Random Forest and 0.106 lower than that of the SARIMAX model. Its coefficient of determination reached 0.97, outperforming Random Forest (0.92) and the SARIMAX model (0.67). These results highlight the effectiveness of combining advanced statistical modeling, machine learning, and targeted feature selection for precise short-term load forecasting. The proposed framework offers a scalable solution for smart grid operations, resource planning, and integration of renewable energy in diverse environments.
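The two headline metrics, RMSLE and the coefficient of determination, can be made concrete with a small sketch. The code below uses the standard textbook definitions and is not taken from the paper. The function names `rmsle` and `r_squared` and the sample readings are hypothetical and chosen only for illustration.

```python
import math


def rmsle(actual, predicted):
    """Root mean square logarithmic error for non-negative values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    # log1p(x) = log(1 + x), which keeps zero readings well defined.
    squared_log_diffs = [
        (math.log1p(p) - math.log1p(a)) ** 2
        for a, p in zip(actual, predicted)
    ]
    return math.sqrt(sum(squared_log_diffs) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical 5-minute load readings in kW (illustrative, not from the paper).
actual_kw = [1.2, 0.8, 2.5, 3.1, 1.9]
predicted_kw = [1.1, 0.9, 2.4, 3.3, 1.8]

print(f"RMSLE = {rmsle(actual_kw, predicted_kw):.4f}")
print(f"R^2   = {r_squared(actual_kw, predicted_kw):.4f}")
```

Because RMSLE compares logarithms, it penalizes relative rather than absolute error, which may suit household load data that swings between near-zero and peak consumption. An R² of 1.0 indicates a perfect fit, while 0.0 matches a model that always predicts the mean.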


Source journal: Energy and AI
Category: Engineering, Engineering (miscellaneous)
CiteScore: 16.50
Self-citation rate: 0.00%
Articles published: 64
Review time: 56 days