Optimizing radiant floor heating control with night setbacks using model-free reinforcement learning and transfer learning

IF 7.6 · Zone 1 (Engineering & Technology) · Q1 CONSTRUCTION & BUILDING TECHNOLOGY

Building and Environment Pub Date : 2025-09-26 DOI:10.1016/j.buildenv.2025.113771

Xu Han, Ali Malkawi, Zhuorui Li, Runyu Zhang, Na Li

Building and Environment, Volume 287, Article 113771. https://www.sciencedirect.com/science/article/pii/S0360132325012417

Citation count: 0

Abstract

Controlling Radiant Floor Heating (RFH) systems with night setbacks is challenging because of their slow-response dynamics. Model Predictive Control (MPC) has proven effective for such systems, but the need for model development constrains its scalability. This study investigates the feasibility of, and approaches to, using model-free Reinforcement Learning (RL) and transfer learning for optimal control of RFH systems with night setbacks. A physics-based model is developed and validated as a virtual testbed. Four distinct RL control (RLC) strategies are proposed and evaluated, alongside a conventional Rule-Based Control (RBC) strategy as a baseline and an MPC as an upper-bound performance benchmark. Our findings reveal that the Deep Q-Network (DQN) with n-step Temporal Difference learning, incorporating weather forecasts as states, achieves the best performance. Relative to RBC, heating demand is reduced by 15-23% with RLC and by 13.1-28.5% with MPC. However, RLC yields more unmet hours than MPC, indicating that further research is needed to improve constraint satisfaction. The transferability of the RLC is also evaluated by applying the trained RL agent to a new building using transfer learning through weight initialization, layer freezing, and fine-tuning. The results show that transfer learning from an existing agent trained on a source building significantly reduces the training time for a new agent on a target building. In conclusion, this study demonstrates the potential of model-free RL and transfer learning in optimizing RFH systems with night setbacks, fostering advancements in scalable optimal building control strategies.
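To make the n-step Temporal Difference idea concrete, the following is a minimal sketch of how an n-step target for a Q-learning update can be computed. It is not the authors' implementation: the abstract does not give the reward definition, the value of n, or the discount factor, so the function name `n_step_td_target`, its parameters, and the numbers in the example are all illustrative assumptions. The general motivation is that summing several observed rewards before bootstrapping lets delayed effects, such as those of a slow-responding heated floor, reach the value estimate in fewer updates than a one-step target would.

```python
from typing import Sequence


def n_step_td_target(
    rewards: Sequence[float],
    bootstrap_q_values: Sequence[float],
    gamma: float = 0.99,
    terminal: bool = False,
) -> float:
    """Return the n-step TD target for a Q-learning update (illustrative sketch).

    rewards: r_t, ..., r_{t+n-1}, the rewards observed over the next n steps.
    bootstrap_q_values: Q-value estimates for every action in state s_{t+n}
        (in a DQN these would typically come from a target network).
    gamma: discount factor (hypothetical default, not taken from the paper).
    terminal: True if the episode ended within the n steps, so no bootstrap.
    """
    # Discounted sum of the n observed rewards.
    target = 0.0
    for k, reward in enumerate(rewards):
        target += (gamma ** k) * reward

    # Bootstrap from the greedy action value n steps ahead, unless terminal.
    if not terminal:
        target += (gamma ** len(rewards)) * max(bootstrap_q_values)
    return target


# Hypothetical 3-step example: negative rewards stand in for an energy or comfort penalty.
example_target = n_step_td_target([-1.0, -0.5, 0.0], [2.0, 4.0], gamma=0.5)
```

With a single reward in `rewards`, the function reduces to the familiar one-step Q-learning target, so n can be treated as a tunable hyperparameter.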

Source journal: Building and Environment (Engineering & Technology, Engineering: Environmental)

CiteScore: 12.50

Self-citation rate: 23.00%

Articles published: 1130

Review time: 27 days

Journal description: Building and Environment, an international journal, is dedicated to publishing original research papers, comprehensive review articles, editorials, and short communications in the fields of building science, urban physics, and human interaction with the indoor and outdoor built environment. The journal emphasizes innovative technologies and knowledge verified through measurement and analysis. It covers environmental performance across various spatial scales, from cities and communities to buildings and systems, fostering collaborative, multi-disciplinary research with broader significance.