Robert Spencer, Surangika Ranathunga, Mikael Boulic, Andries (Hennie) van Heerden, Teo Susnjak
Title: Transfer learning on transformers for building energy consumption forecasting—A comparative study
Journal: Energy and Buildings, Volume 336, Article 115632 (Q1, Construction & Building Technology; IF 6.6)
DOI: 10.1016/j.enbuild.2025.115632
Published: 18 March 2025
URL: https://www.sciencedirect.com/science/article/pii/S0378778825003627
Citations: 0
Abstract
Energy consumption in buildings is steadily increasing, leading to higher carbon emissions. Predicting energy consumption is a key factor in addressing climate change. There has been a significant shift from traditional statistical models to advanced deep learning (DL) techniques for predicting energy use in buildings. However, data scarcity in newly constructed or poorly instrumented buildings limits the effectiveness of standard DL approaches. In this study, we investigate the application of six data-centric Transfer Learning (TL) strategies on three Transformer architectures—vanilla Transformer, Informer, and PatchTST—to enhance building energy consumption forecasting. Transformers, a relatively new DL framework, have demonstrated significant promise in various domains; yet, prior TL research has often focused on either a single data-centric strategy or older models such as Recurrent Neural Networks. Using 16 diverse datasets from the Building Data Genome Project 2, we conduct an extensive empirical analysis under varying feature spaces (e.g., recorded ambient weather) and building characteristics (e.g., dataset volume). Our experiments show that combining multiple source datasets under a zero-shot setup reduces the Mean Absolute Error (MAE) of the vanilla Transformer model by an average of 15.9 % for 24 h forecasts, compared to single-source baselines. Further fine-tuning these multi-source models with target-domain data yields an additional 3–5 % improvement. Notably, PatchTST outperforms the vanilla Transformer and Informer models. Overall, our results underscore the potential of combining Transformer architectures with TL techniques to enhance building energy consumption forecasting accuracy. However, careful selection of the TL strategy and attention to feature space compatibility are needed to maximize forecasting gains.
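The abstract reports improvements as relative reductions in Mean Absolute Error (MAE) over a 24-hour forecast horizon. As a minimal sketch of that arithmetic (the data below is synthetic and illustrative only, not from the paper's experiments):

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean Absolute Error over a forecast horizon."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

# Hypothetical hourly energy readings (kWh) for a 24 h horizon,
# with noisier errors for the single-source baseline than for the
# multi-source zero-shot model.
rng = np.random.default_rng(0)
actual = rng.uniform(50, 100, size=24)
single_source_pred = actual + rng.normal(0, 8, size=24)
multi_source_pred = actual + rng.normal(0, 6, size=24)

baseline_mae = mae(actual, single_source_pred)
zero_shot_mae = mae(actual, multi_source_pred)

# Relative MAE reduction, as reported in the abstract (e.g. 15.9 %).
reduction_pct = 100 * (baseline_mae - zero_shot_mae) / baseline_mae
print(f"baseline MAE = {baseline_mae:.2f} kWh, "
      f"zero-shot MAE = {zero_shot_mae:.2f} kWh, "
      f"reduction = {reduction_pct:.1f} %")
```

The same comparison would then be repeated after fine-tuning the multi-source model on target-domain data to measure the additional 3–5 % gain the study describes.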
Journal description:
An international journal devoted to investigations of energy use and efficiency in buildings
Energy and Buildings is an international journal publishing articles with explicit links to energy use in buildings. Its aim is to present new research results and proven practice aimed at reducing the energy needs of buildings and improving indoor environment quality.