Advanced day-ahead scheduling of HVAC demand response control using novel strategy of Q-learning, model predictive control, and input convex neural networks

IF 9.6 Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE
Rahman Heidarykiany, Cristinel Ababei
DOI: 10.1016/j.egyai.2025.100509
Journal: Energy and AI, Volume 20, Article 100509
Publication date: 2025-04-15
URL: https://www.sciencedirect.com/science/article/pii/S2666546825000412
Citation count: 0

Abstract

In this paper, we present a Q-Learning optimization algorithm for smart home HVAC systems. The proposed algorithm combines new convex deep neural network models with model predictive control (MPC) techniques. More specifically, new input convex long short-term memory (ICLSTM) models are employed to predict dynamic states in an MPC optimal control technique integrated within a Q-Learning reinforcement learning (RL) algorithm, in order to further improve the learned temporal behaviors of nonlinear HVAC systems. As a novel RL approach, the proposed algorithm generates day-ahead HVAC demand response (DR) signals in smart homes that optimally reduce and/or shift peak energy usage, reduce electricity costs, minimize user discomfort, and honor, in a best-effort way, the recommendations from the utility/aggregator, which in turn has an impact on the overall well-being of the distribution network controlled by the aggregator. The proposed Q-Learning optimization algorithm, based on epsilon-model predictive control (ϵ-MPC), can be implemented as a control agent executed by the smart house energy management (SHEM) system assumed to exist in the smart home; the SHEM system can interact with the energy provider of the distribution network, i.e., the utility/aggregator, via the smart meter. The output generated by the proposed control agent represents day-ahead local DR signals in the form of temperature setpoints for the HVAC system, which the optimization process finds to yield the desired trade-offs between electricity cost and user discomfort. The proposed algorithm can be used in smart homes with passive HVAC controllers, which react solely to end-user setpoints, to transform them into smart homes with active HVAC controllers. Such systems not only respond to the preferences of the end-user but also incorporate an external control signal provided by the utility or aggregator.
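The input-convexity property the ICLSTM models rely on can be illustrated with a much simpler feedforward input convex neural network (ICNN). The following is a hedged sketch of the general principle only, not the paper's ICLSTM architecture: constraining the hidden-path weights to be non-negative while the activations are convex and non-decreasing makes the scalar output convex in the input, which is what allows a downstream MPC step to optimize over the network's predictions efficiently.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(v):
    # Convex and non-decreasing, as input convexity requires
    return np.maximum(v, 0.0)

# Hypothetical 2-layer ICNN: output = relu(Wz @ relu(Wx1 @ x + b1) + Wx2 @ x + b2)
Wx1 = rng.normal(size=(8, 3))          # input weights: unconstrained
b1 = rng.normal(size=8)
Wz = np.abs(rng.normal(size=(1, 8)))   # hidden-path weights: non-negative (the key constraint)
Wx2 = rng.normal(size=(1, 3))          # input skip connection: unconstrained
b2 = rng.normal(size=1)

def icnn(x):
    z1 = relu(Wx1 @ x + b1)
    return float(relu(Wz @ z1 + Wx2 @ x + b2))

# Numerical convexity check: f((a+b)/2) <= (f(a)+f(b))/2 for random input pairs
ok = True
for _ in range(1000):
    a, b = rng.normal(size=3), rng.normal(size=3)
    ok &= icnn((a + b) / 2) <= (icnn(a) + icnn(b)) / 2 + 1e-9
print(ok)  # True
```

Because a non-negative combination of convex functions is convex, and a convex non-decreasing function of a convex function is convex, the check above holds exactly (up to floating-point tolerance).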
Simulation experiments conducted with a custom simulation tool demonstrate that the proposed optimization framework can offer significant benefits. It achieves an 87% higher success rate in optimizing setpoints within the desired range, resulting in up to 15% energy savings with zero temperature discomfort.
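The day-ahead setpoint scheduling idea can be sketched with plain tabular Q-learning over a toy thermal and tariff model. Everything below is a hypothetical illustration: the tariff, outdoor temperature profile, linear energy model, and quadratic discomfort penalty are made-up stand-ins, and the sketch omits the paper's ICLSTM state predictor and ϵ-MPC integration entirely.

```python
import numpy as np

rng = np.random.default_rng(1)

HOURS = 24
SETPOINTS = np.array([20.0, 21.0, 22.0, 23.0, 24.0])   # candidate cooling setpoints (deg C), assumed
# Toy peak/off-peak tariff ($/kWh): peak window 14:00-20:00
PRICE = np.where((np.arange(HOURS) >= 14) & (np.arange(HOURS) < 20), 0.30, 0.10)
OUTDOOR = 28.0 + 6.0 * np.sin(np.pi * np.arange(HOURS) / HOURS)  # toy outdoor temperature (deg C)
COMFORT = 22.0                                                    # user-preferred temperature

def reward(hour, sp):
    energy = 0.8 * max(OUTDOOR[hour] - sp, 0.0)   # toy linear HVAC energy model (kWh)
    discomfort = (sp - COMFORT) ** 2              # quadratic discomfort penalty
    return -(PRICE[hour] * energy + 0.05 * discomfort)

# Tabular Q-learning over (hour, setpoint) with epsilon-greedy exploration
Q = np.zeros((HOURS, len(SETPOINTS)))
alpha, gamma, eps = 0.1, 0.95, 0.2
for episode in range(3000):
    for h in range(HOURS):
        a = rng.integers(len(SETPOINTS)) if rng.random() < eps else int(np.argmax(Q[h]))
        r = reward(h, SETPOINTS[a])
        next_best = Q[(h + 1) % HOURS].max()      # state advances deterministically to the next hour
        Q[h, a] += alpha * (r + gamma * next_best - Q[h, a])

schedule = SETPOINTS[np.argmax(Q, axis=1)]        # greedy day-ahead setpoint schedule
print(schedule)
```

In this toy setting the learned schedule raises setpoints during the expensive peak window (less cooling, higher discomfort tolerated) and lowers them off-peak, which is the peak-shifting trade-off between electricity cost and user discomfort that the abstract describes.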


Source Journal

Energy and AI (Engineering, miscellaneous)
CiteScore: 16.50
Self-citation rate: 0.00%
Articles per year: 64
Review time: 56 days