Improving HVAC control with transfer learning: Using padding techniques for cross-building pre-training and fine-tuning

Impact factor 9.6; JCR Q1 (Computer Science, Artificial Intelligence)

Energy and AI; Pub Date: 2025-06-11; DOI: 10.1016/j.egyai.2025.100531

Kevlyn Kadamala, Des Chambers, Enda Barrett

Energy and AI, Volume 21, Article 100531. https://www.sciencedirect.com/science/article/pii/S2666546825000631


Abstract

Recent work has shown that control strategies based on Deep Reinforcement Learning (DRL) can substantially improve the management of HVAC and energy systems in buildings, yielding significant energy savings and better comfort. Unlike conventional rule-based controllers, however, DRL controllers need considerable time and data to learn effective policies. Transfer learning with pre-trained models can help address this issue. In this work, we use imitation learning (IL) for pre-training and reinforcement learning (RL) for fine-tuning. HVAC systems vary with location, building size, structure, construction materials and weather conditions, and this diversity across buildings complicates the use of IL and RL: neural network weights trained on a source building cannot be transferred directly to a target building because of differences in input features and in the amount of control equipment. To overcome this problem, we propose a novel padding method that ensures the source and target buildings share the same state space dimensionality. The trained neural network weights then become transferable, and only the output layer must be adjusted to fit the dimensionality of the target action space. For comparison, we also evaluate an existing padding technique, zero padding. Our experiments show that the novel padding technique outperforms zero padding by 1.37% and training from scratch by 4.59% on average.
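The transfer idea in the abstract can be illustrated with a minimal sketch. The abstract does not describe the novel padding method, so the code below shows only the zero-padding baseline, followed by a weight transfer in which the hidden layer is reused and the output layer is re-initialized for the target action space. Every name, dimension and observation value here (`zero_pad_state`, `transfer_policy`, `SHARED_STATE_DIM`, the two-layer policy) is a hypothetical choice made for illustration and is not taken from the paper.

```python
import copy
import math
import random


def zero_pad_state(state, shared_dim):
    """Right-pad an observation with zeros up to the shared state dimension."""
    if len(state) > shared_dim:
        raise ValueError("state is longer than the shared dimension")
    return list(state) + [0.0] * (shared_dim - len(state))


def random_layer(rng, n_in, n_out):
    """Dense layer with small random weights and zero bias."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    return {"weights": weights, "bias": [0.0] * n_out}


def apply_layer(layer, inputs):
    """Compute inputs @ weights + bias for one dense layer."""
    n_out = len(layer["bias"])
    return [
        layer["bias"][j]
        + sum(x * row[j] for x, row in zip(inputs, layer["weights"]))
        for j in range(n_out)
    ]


def forward(policy, state):
    """Two-layer policy: tanh hidden layer followed by a linear output layer."""
    hidden = [math.tanh(v) for v in apply_layer(policy["hidden"], state)]
    return apply_layer(policy["output"], hidden)


def transfer_policy(source, rng, target_action_dim):
    """Reuse the source hidden layer; create a fresh output layer for the target."""
    hidden_dim = len(source["hidden"]["bias"])
    return {
        "hidden": copy.deepcopy(source["hidden"]),
        "output": random_layer(rng, hidden_dim, target_action_dim),
    }


# Hypothetical sizes: the source building exposes 5 features and 2 actions,
# the target building exposes 8 features and 4 actions.
SHARED_STATE_DIM = 8
rng = random.Random(0)

source_policy = {
    "hidden": random_layer(rng, SHARED_STATE_DIM, 16),
    "output": random_layer(rng, 16, 2),
}
# (Pre-training on the source building would happen here.)
target_policy = transfer_policy(source_policy, rng, target_action_dim=4)

source_obs = zero_pad_state([21.5, 0.4, 3.0, 1.0, 0.0], SHARED_STATE_DIM)
target_obs = zero_pad_state(
    [19.0, 0.6, 2.0, 1.0, 0.0, 0.3, 0.7, 22.0], SHARED_STATE_DIM
)

source_action = forward(source_policy, source_obs)
target_action = forward(target_policy, target_obs)
```

Because both observations have the shared length, the same hidden-layer weights accept input from either building; only the output layer differs in shape. In this sketch, fine-tuning on the target building would start from `target_policy`.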
