An innovative heterogeneous transfer learning framework to enhance the scalability of deep reinforcement learning controllers in buildings with integrated energy systems

IF 6.1 | Zone 1, Engineering & Technology | Q1, Construction & Building Technology

Building Simulation Pub Date : 2024-02-20 DOI:10.1007/s12273-024-1109-6

Davide Coraci, Silvio Brandi, Tianzhen Hong, Alfonso Capozzoli


Citations: 0

Abstract

Deep Reinforcement Learning (DRL)-based control outperforms Rule-Based Controllers (RBCs) in managing integrated energy systems, but it still lacks scalability and generalisation because training requires tailored models. Transfer Learning (TL) is a potential solution to this limitation. However, existing TL applications in building control have mostly been tested among buildings with similar features, and so do not address the need to scale up advanced control in real-world scenarios with diverse energy systems. This paper assesses the performance of an online heterogeneous TL strategy, comparing it with an RBC and with offline and online DRL controllers in a simulation setup using EnergyPlus and Python. The study tests the transfer, in both transductive and inductive settings, of a DRL policy designed to manage a chiller coupled with a Thermal Energy Storage (TES). The control policy is pre-trained on a source building and transferred to various target buildings characterised by an integrated energy system that includes photovoltaic and battery energy storage systems, and by different building envelope features, occupancy schedules and boundary conditions (e.g., weather and price signal). The TL approach combines model slicing, imitation learning and fine-tuning to handle the different state spaces and reward functions of the source and target buildings. Results show that the proposed methodology reduces electricity cost by 10% and the mean value of the daily average temperature violation rate by between 10% and 40% compared to RBC and online DRL controllers. Moreover, online TL increases self-sufficiency and self-consumption by 9% and 11%, respectively, with respect to RBC. Conversely, online TL performs worse than offline DRL in both transductive and inductive settings. However, offline DRL agents must be trained for at least 15 episodes to reach the same level of performance as online TL. Therefore, the proposed online TL methodology is effective and completely model-free, and it can be implemented directly in real buildings with satisfactory performance.
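
The abstract reports gains in self-sufficiency and self-consumption without defining them. As a rough illustration only, the sketch below uses the definitions commonly found in the photovoltaic literature: self-sufficiency is the share of the building load covered by on-site PV, and self-consumption is the share of PV generation used on site. It assumes no battery (the paper's systems include one), and the function name `pv_metrics` and the sample values are hypothetical, not taken from the paper.

```python
def pv_metrics(load_kwh, pv_kwh):
    """Return (self_sufficiency, self_consumption) for paired per-step series.

    Assumption (not from the paper): no battery, so the PV energy used
    on site at each time step is min(load, pv).
    """
    if len(load_kwh) != len(pv_kwh):
        raise ValueError("load and PV series must have the same length")

    # PV energy consumed directly by the building at each step
    used_on_site = sum(min(load, pv) for load, pv in zip(load_kwh, pv_kwh))
    total_load = sum(load_kwh)
    total_pv = sum(pv_kwh)

    # Guard against empty or all-zero series
    self_sufficiency = used_on_site / total_load if total_load > 0 else 0.0
    self_consumption = used_on_site / total_pv if total_pv > 0 else 0.0
    return self_sufficiency, self_consumption


# Hypothetical hourly values in kWh, for illustration only
load = [2.0, 3.0, 4.0, 1.0]
pv = [0.0, 4.0, 2.0, 2.0]
ss, sc = pv_metrics(load, pv)
print(f"self-sufficiency: {ss:.2f}, self-consumption: {sc:.2f}")
```

With a battery, the on-site term would also need to count PV energy stored and later discharged to the load, which is where a controller can raise both ratios by shifting charging to hours of PV surplus.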




Source journal

Building Simulation (Thermodynamics; Construction & Building Technology)

CiteScore: 10.20
Self-citation rate: 16.40%
Articles published: not listed
Review time: >12 weeks

Journal description: Building Simulation: An International Journal publishes original, high-quality, peer-reviewed research papers and review articles dealing with the modeling and simulation of buildings, including their systems. The goal is to advance the field of building science and technology to the point where modeling is used in every aspect of building construction as a matter of routine rather than as an exception. Of particular interest are papers that reflect recent developments and applications of modeling tools and their impact on advances in building science and technology.