A practical reinforcement learning control design for nonlinear systems with input and output constraints

IF 3.9 | CAS Tier 2 (Engineering & Technology) | JCR Q2, COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS
Hesam Hassanpour, Brandon Corbett, Prashant Mhaskar
{"title":"具有输入输出约束的非线性系统的实用强化学习控制设计","authors":"Hesam Hassanpour ,&nbsp;Brandon Corbett ,&nbsp;Prashant Mhaskar","doi":"10.1016/j.compchemeng.2025.109248","DOIUrl":null,"url":null,"abstract":"<div><div>In this work, a practically implementable reinforcement learning (RL)-based controller is designed to handle process input and output constraints. In a typical RL problem, an RL agent is employed to learn an optimal control policy through interactions with the environment. This is unimplementable in practical situations due to the excessive exploration needed by the RL-based controller and exacerbated by the possible violation of the input and output constraints. We previously proposed an implementable RL controller that can circumvent random exploration needs by leveraging existing model predictive control (MPC) to pre-train/warm start the RL agent. The pre-trained agent is subsequently employed in real-time to engage with the process to improve its performance by gaining more knowledge about the nonlinear behavior of the system. This work generalizes our previous method to handle constraints on the outputs and the rate of change of the inputs by modifying the reward function. The effectiveness of the proposed algorithm is illustrated through simulations conducted for control of a pH neutralization process. The findings indicate that the proposed RL method enhances closed-loop performance in comparison to the nominal MPC while satisfying all input and output constraints.</div></div>","PeriodicalId":286,"journal":{"name":"Computers & Chemical Engineering","volume":"201 ","pages":"Article 109248"},"PeriodicalIF":3.9000,"publicationDate":"2025-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"A practical reinforcement learning control design for nonlinear systems with input and output constraints\",\"authors\":\"Hesam Hassanpour ,&nbsp;Brandon Corbett ,&nbsp;Prashant Mhaskar\",\"doi\":\"10.1016/j.compchemeng.2025.109248\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>In this work, a practically implementable reinforcement learning (RL)-based controller is designed to handle process input and output constraints. In a typical RL problem, an RL agent is employed to learn an optimal control policy through interactions with the environment. This is unimplementable in practical situations due to the excessive exploration needed by the RL-based controller and exacerbated by the possible violation of the input and output constraints. We previously proposed an implementable RL controller that can circumvent random exploration needs by leveraging existing model predictive control (MPC) to pre-train/warm start the RL agent. The pre-trained agent is subsequently employed in real-time to engage with the process to improve its performance by gaining more knowledge about the nonlinear behavior of the system. This work generalizes our previous method to handle constraints on the outputs and the rate of change of the inputs by modifying the reward function. The effectiveness of the proposed algorithm is illustrated through simulations conducted for control of a pH neutralization process. 
The findings indicate that the proposed RL method enhances closed-loop performance in comparison to the nominal MPC while satisfying all input and output constraints.</div></div>\",\"PeriodicalId\":286,\"journal\":{\"name\":\"Computers & Chemical Engineering\",\"volume\":\"201 \",\"pages\":\"Article 109248\"},\"PeriodicalIF\":3.9000,\"publicationDate\":\"2025-07-03\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Computers & Chemical Engineering\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0098135425002522\",\"RegionNum\":2,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers & Chemical Engineering","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0098135425002522","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
Citations: 0

Abstract

In this work, a practically implementable reinforcement learning (RL)-based controller is designed to handle process input and output constraints. In a typical RL problem, an RL agent learns an optimal control policy through interactions with the environment. This is unimplementable in practice because of the excessive exploration an RL-based controller requires, a difficulty exacerbated by possible violations of the input and output constraints during learning. We previously proposed an implementable RL controller that circumvents the need for random exploration by leveraging an existing model predictive control (MPC) formulation to pre-train/warm-start the RL agent. The pre-trained agent is subsequently deployed in real time to interact with the process and improve its performance by gaining more knowledge of the system's nonlinear behavior. This work generalizes our previous method to handle constraints on the outputs and on the rate of change of the inputs by modifying the reward function. The effectiveness of the proposed algorithm is illustrated through simulations of a pH neutralization process. The findings indicate that the proposed RL method enhances closed-loop performance relative to the nominal MPC while satisfying all input and output constraints.
Source journal
Computers & Chemical Engineering (Engineering & Technology - Chemical Engineering)
CiteScore: 8.70
Self-citation rate: 14.00%
Annual articles: 374
Review time: 70 days
Journal description: Computers & Chemical Engineering is primarily a journal of record for new developments in the application of computing and systems technology to chemical engineering problems.