Deep reinforcement learning-based control of thermal energy storage for university classrooms: Co-Simulation with TRNSYS-Python and transfer learning across operational scenarios
Giacomo Buscemi , Francesco Paolo Cuomo , Giuseppe Razzano , Francesco Liberato Cappiello , Silvio Brandi
{"title":"Deep reinforcement learning-based control of thermal energy storage for university classrooms: Co-Simulation with TRNSYS-Python and transfer learning across operational scenarios","authors":"Giacomo Buscemi , Francesco Paolo Cuomo , Giuseppe Razzano , Francesco Liberato Cappiello , Silvio Brandi","doi":"10.1016/j.egyr.2025.07.003","DOIUrl":null,"url":null,"abstract":"<div><div>Advanced controllers leveraging predictive and adaptive methods play a crucial role in optimizing building energy management and enhancing flexibility for maximizing the use of on-site renewable energy sources. This study investigates the application of Deep Reinforcement Learning (DRL) algorithms for optimizing the operation of an Air Handling Unit (AHU) system coupled with an inertial Thermal Energy Storage (TES) tank, serving educational spaces at the <em>Politecnico di Torino</em> campus. The TES supports both cooling in summer and heating in winter through the AHU’s thermal exchange coils, while proportional control mechanisms regulate downstream air conditions to ensure indoor comfort. The building’s electric energy demand is partially met by a 47 kW photovoltaic (PV) field and by power grid.</div><div>To evaluate the performance of DRL controllers, a co-simulation framework integrating TRNSYS 18 with Python was developed. This setup enables a comparative assessment between a baseline Rule-Based (RB) control approach, and a Soft Actor-Critic (SAC) Reinforcement Learning (RL) controller. The DRL controllers are trained to minimize heat pump (HP) energy consumption, strategically aligning operations with PV availability and electricity price fluctuations, while enforcing safety constraints to prevent temperature violations in the TES.</div><div>Simulation results demonstrate that the DRL approach achieved a 4.73 MWh reduction in annual primary energy consumption and a 3.2% decrease in operating costs compared to RB control, with electricity cost savings reaching 5.8%. To evaluate the controller’s generalizability, zero-shot transfer learning was employed to deploy the pre-trained DRL agent across different climatic conditions, tariff structures, and system fault scenarios without retraining. In comparison with the RB control, the transfer learning results demonstrated high adaptability, with electricity cost reductions ranging from 11.2% to 24.5% and gas consumption savings between 7.1% and 69.7% across diverse operating conditions.</div></div>","PeriodicalId":11798,"journal":{"name":"Energy Reports","volume":"14 ","pages":"Pages 1349-1367"},"PeriodicalIF":5.1000,"publicationDate":"2025-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Energy Reports","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2352484725004172","RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENERGY & FUELS","Score":null,"Total":0}
Abstract
Advanced controllers leveraging predictive and adaptive methods play a crucial role in optimizing building energy management and enhancing flexibility for maximizing the use of on-site renewable energy sources. This study investigates the application of Deep Reinforcement Learning (DRL) algorithms for optimizing the operation of an Air Handling Unit (AHU) system coupled with an inertial Thermal Energy Storage (TES) tank, serving educational spaces at the Politecnico di Torino campus. The TES supports both cooling in summer and heating in winter through the AHU’s thermal exchange coils, while proportional control mechanisms regulate downstream air conditions to ensure indoor comfort. The building’s electric energy demand is partially met by a 47 kW photovoltaic (PV) field and by the power grid.
To evaluate the performance of DRL controllers, a co-simulation framework integrating TRNSYS 18 with Python was developed. This setup enables a comparative assessment between a baseline Rule-Based (RB) control approach and a Soft Actor-Critic (SAC) Reinforcement Learning (RL) controller. The DRL controllers are trained to minimize heat pump (HP) energy consumption, strategically aligning operations with PV availability and electricity price fluctuations, while enforcing safety constraints to prevent temperature violations in the TES.
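A minimal sketch of how such a TRNSYS-Python co-simulation loop and a SAC agent could be coupled is given below. The abstract does not detail the data-exchange mechanism or the reward design, so the TrnsysCoSim bridge, the observation and action definitions, the TES temperature bounds, and the penalty weight are all hypothetical placeholders; Gymnasium and Stable-Baselines3 are assumed purely for illustration and are not confirmed by the paper. The reward mirrors the stated objectives: purchased electricity cost (HP demand net of PV generation, weighted by the time-varying price) is minimized, and TES temperature violations are penalized.

```python
# Illustrative sketch only: the TrnsysCoSim bridge, bounds, and weights are
# hypothetical placeholders, not the implementation described in the paper.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import SAC


class TesAhuEnv(gym.Env):
    """RL environment wrapping one TRNSYS co-simulation timestep."""

    def __init__(self, cosim, t_tes_min=8.0, t_tes_max=14.0, penalty=10.0):
        self.cosim = cosim                        # hypothetical TRNSYS 18 bridge
        self.t_min, self.t_max = t_tes_min, t_tes_max
        self.penalty = penalty
        # Observation: TES temperature, outdoor temperature, PV generation,
        # electricity price, hour of day (all placeholders)
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(5,), dtype=np.float32)
        # Action: normalized heat pump modulation level
        self.action_space = spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        obs = self.cosim.reset()                  # restart the TRNSYS simulation
        return np.asarray(obs, dtype=np.float32), {}

    def step(self, action):
        # Advance TRNSYS by one timestep with the chosen HP command
        obs, hp_kwh, pv_kwh, price, t_tes, done = self.cosim.step(float(action[0]))
        # Reward: minimize purchased electricity cost, penalize TES violations
        grid_kwh = max(hp_kwh - pv_kwh, 0.0)
        reward = -price * grid_kwh
        if not (self.t_min <= t_tes <= self.t_max):
            reward -= self.penalty
        return np.asarray(obs, dtype=np.float32), reward, done, False, {}


# Training loop with hypothetical hyperparameters:
# env = TesAhuEnv(cosim=TrnsysCoSim("campus_classrooms.dck"))
# agent = SAC("MlpPolicy", env, learning_rate=3e-4, verbose=1)
# agent.learn(total_timesteps=200_000)
# agent.save("sac_tes_ahu")
```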
Simulation results demonstrate that the DRL approach achieved a 4.73 MWh reduction in annual primary energy consumption and a 3.2% decrease in operating costs compared to RB control, with electricity cost savings reaching 5.8%. To evaluate the controller’s generalizability, zero-shot transfer learning was employed to deploy the pre-trained DRL agent across different climatic conditions, tariff structures, and system fault scenarios without retraining. In comparison with the RB control, the transfer learning results demonstrated high adaptability, with electricity cost reductions ranging from 11.2% to 24.5% and gas consumption savings between 7.1% and 69.7% across diverse operating conditions.
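Zero-shot transfer in this setting amounts to reusing the trained policy weights unchanged on a new simulation configuration (different climate file, tariff structure, or fault scenario). A hedged evaluation sketch, reusing the hypothetical environment above, is shown below; the file names and the evaluate helper are illustrative only.

```python
# Hypothetical zero-shot evaluation: the pre-trained policy is loaded and run
# on a new scenario (different climate, tariff, or fault) with no retraining.
from stable_baselines3 import SAC


def evaluate(agent, env, n_steps=8760):
    """Roll out the frozen policy for one simulated year of hourly steps."""
    obs, _ = env.reset()
    total_cost = 0.0
    for _ in range(n_steps):
        action, _ = agent.predict(obs, deterministic=True)   # no exploration
        obs, reward, terminated, truncated, _ = env.step(action)
        total_cost += -reward        # negative reward = cost plus penalties
        if terminated or truncated:
            break
    return total_cost


# agent = SAC.load("sac_tes_ahu")                             # weights frozen
# new_env = TesAhuEnv(cosim=TrnsysCoSim("alternative_climate_tariff.dck"))
# print("Zero-shot annual cost:", evaluate(agent, new_env))
```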
Journal Introduction
Energy Reports is a new online multidisciplinary open access journal that focuses on publishing new research in the area of energy with rapid review and publication times. Energy Reports is open to direct submissions and also to submissions from other Elsevier energy journals whose Editors have determined that Energy Reports would be a better fit.