Robert Spencer, Surangika Ranathunga, Mikael Boulic, Andries (Hennie) van Heerden, Teo Susnjak
Title: Transfer learning on transformers for building energy consumption forecasting—A comparative study
Journal: Energy and Buildings, Volume 336, Article 115632 (Q1, Construction & Building Technology; IF 6.6)
DOI: 10.1016/j.enbuild.2025.115632
Published: 18 March 2025
URL: https://www.sciencedirect.com/science/article/pii/S0378778825003627
Citations: 0
Abstract
Energy consumption in buildings is steadily increasing, leading to higher carbon emissions. Predicting energy consumption is a key factor in addressing climate change. There has been a significant shift from traditional statistical models to advanced deep learning (DL) techniques for predicting energy use in buildings. However, data scarcity in newly constructed or poorly instrumented buildings limits the effectiveness of standard DL approaches. In this study, we investigate the application of six data-centric Transfer Learning (TL) strategies on three Transformer architectures—vanilla Transformer, Informer, and PatchTST—to enhance building energy consumption forecasting. Transformers, a relatively new DL framework, have demonstrated significant promise in various domains; yet, prior TL research has often focused on either a single data-centric strategy or older models such as Recurrent Neural Networks. Using 16 diverse datasets from the Building Data Genome Project 2, we conduct an extensive empirical analysis under varying feature spaces (e.g., recorded ambient weather) and building characteristics (e.g., dataset volume). Our experiments show that combining multiple source datasets under a zero-shot setup reduces the Mean Absolute Error (MAE) of the vanilla Transformer model by an average of 15.9 % for 24 h forecasts, compared to single-source baselines. Further fine-tuning these multi-source models with target-domain data yields an additional 3–5 % improvement. Notably, PatchTST outperforms the vanilla Transformer and Informer models. Overall, our results underscore the potential of combining Transformer architectures with TL techniques to enhance building energy consumption forecasting accuracy. However, careful selection of the TL strategy and attention to feature space compatibility are needed to maximize forecasting gains.
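The abstract reports improvements as relative reductions in Mean Absolute Error (MAE) over a 24-hour forecast horizon. As a minimal sketch of that arithmetic (the data below is synthetic and illustrative only, not from the paper's experiments):

```python
import numpy as np

def mae(y_true, y_pred):
    """Mean Absolute Error over a forecast horizon."""
    return float(np.mean(np.abs(np.asarray(y_true) - np.asarray(y_pred))))

# Hypothetical hourly energy readings (kWh) for a 24 h horizon,
# with noisier errors for the single-source baseline than for the
# multi-source zero-shot model.
rng = np.random.default_rng(0)
actual = rng.uniform(50, 100, size=24)
single_source_pred = actual + rng.normal(0, 8, size=24)
multi_source_pred = actual + rng.normal(0, 6, size=24)

baseline_mae = mae(actual, single_source_pred)
zero_shot_mae = mae(actual, multi_source_pred)

# Relative MAE reduction, as reported in the abstract (e.g. 15.9 %).
reduction_pct = 100 * (baseline_mae - zero_shot_mae) / baseline_mae
print(f"baseline MAE = {baseline_mae:.2f} kWh, "
      f"zero-shot MAE = {zero_shot_mae:.2f} kWh, "
      f"reduction = {reduction_pct:.1f} %")
```

The same comparison would then be repeated after fine-tuning the multi-source model on target-domain data to measure the additional 3–5 % gain the study describes.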
Journal description:
An international journal devoted to investigations of energy use and efficiency in buildings
Energy and Buildings is an international journal publishing articles with explicit links to energy use in buildings. Its aim is to present new research results and proven practice aimed at reducing the energy needs of buildings and improving indoor environment quality.