A trustworthy reinforcement learning framework for autonomous control of a large-scale complex heating system: Simulation and field implementation

Impact Factor: 10.1 · CAS Tier 1 (Engineering & Technology) · JCR Q1, ENERGY & FUELS
Amirreza Heidari, Luc Girardin, Cédric Dorsaz, François Maréchal
DOI: 10.1016/j.apenergy.2024.124815
Journal: Applied Energy, Volume 378, Article 124815
Published: 2024-11-08 (Journal Article)
URL: https://www.sciencedirect.com/science/article/pii/S0306261924021986
Citations: 0

Abstract

Traditional control approaches heavily rely on hard-coded expert knowledge, complicating the development of optimal control solutions as system complexity increases. Deep Reinforcement Learning (DRL) offers a self-learning control solution, proving advantageous in scenarios where crafting expert-based solutions becomes intricate. This study investigates the potential of DRL for supervisory control in a unique and complex heating system within a large-scale university building. The DRL framework aims to minimize energy costs while ensuring occupant comfort. However, the trial-and-error learning approach of DRL raises concerns about the trustworthiness of executed actions, hindering practical implementation. To address this, the study incorporates action masking, enabling the integration of hard constraints into DRL to enhance user trust. Maskable Proximal Policy Optimization (MPPO) is evaluated alongside standard Proximal Policy Optimization (PPO) and Soft Actor–Critic (SAC). Simulation results reveal that MPPO achieves comparable energy savings (8% relative to the baseline control) with fewer comfort violations than other methods. Therefore, it is selected among the candidate algorithms and experimentally implemented in the university building over one week. Experimental findings demonstrate that MPPO reduces energy costs while maintaining occupant comfort, resulting in a 36% saving compared to a historical day with similar weather conditions. These results underscore the proactive decision-making capability of DRL, establishing its viability for autonomous control in complex energy systems.
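The action masking the abstract describes can be illustrated with a minimal, self-contained sketch: a boolean mask sets the logits of constraint-violating actions to negative infinity before the softmax, so the policy assigns them zero probability and the agent can never execute them. This is an illustrative toy, not the paper's MPPO implementation (which applies the same idea inside the PPO policy update); the action set and constraint below are hypothetical.

```python
import math

def masked_action_probs(logits, mask):
    """Turn policy logits into action probabilities while enforcing a
    hard constraint: actions with mask[i] == False get a logit of -inf,
    so the softmax assigns them exactly zero probability."""
    masked = [l if ok else float("-inf") for l, ok in zip(logits, mask)]
    peak = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(l - peak) for l in masked]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical example: four discrete supervisory actions (e.g. heating
# setpoint levels); suppose a comfort constraint forbids actions 0 and 3.
logits = [1.0, 2.0, 0.5, 3.0]
mask = [False, True, True, False]
probs = masked_action_probs(logits, mask)
print([round(p, 3) for p in probs])  # forbidden actions get 0.0
```

Because the constraint is enforced in the action distribution itself rather than through a reward penalty, the agent cannot violate it even while exploring, which is what makes the trial-and-error phase trustworthy enough for field deployment.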
Source Journal

Applied Energy (Engineering & Technology – Chemical Engineering)
CiteScore: 21.20
Self-citation rate: 10.70%
Articles per year: 1830
Review time: 41 days
Journal description: Applied Energy serves as a platform for sharing innovations, research, development, and demonstrations in energy conversion, conservation, and sustainable energy systems. The journal covers topics such as optimal energy resource use, environmental pollutant mitigation, and energy process analysis. It welcomes original papers, review articles, technical notes, and letters to the editor. Authors are encouraged to submit manuscripts that bridge the gap between research, development, and implementation. The journal addresses a wide spectrum of topics, including fossil and renewable energy technologies, energy economics, and environmental impacts. Applied Energy also explores modeling and forecasting, conservation strategies, and the social and economic implications of energy policies, including climate change mitigation. It is complemented by the open-access journal Advances in Applied Energy.