Simultaneous tuning of multiple PID controllers for multivariable systems using deep reinforcement learning

Impact Factor: 3.0 | Q2 | Engineering, Chemical
Sammyak Mate, Pawankumar Pal, Anshumali Jaiswal, Sharad Bhartiya
{"title":"基于深度强化学习的多变量系统多PID控制器同时整定","authors":"Sammyak Mate,&nbsp;Pawankumar Pal,&nbsp;Anshumali Jaiswal,&nbsp;Sharad Bhartiya","doi":"10.1016/j.dche.2023.100131","DOIUrl":null,"url":null,"abstract":"<div><p>Traditionally, tuning of PID controllers is based on linear approximation of the dynamics between the manipulated input and the controlled output. The tuning is performed one loop at a time and interaction effects between the multiple single-input-single-output (SISO) feedback control loops is ignored. It is also well-known that if the plant operates over a wide operating range, the dynamic behaviour changes thereby rendering the performance of an initially tuned PID controller unacceptable. The design of PID controllers, in general, is based on linear models that are obtained by linearizing a nonlinear system around a steady state operating point. For example, in peak seeking control, the sign of the process gain changes around the peak value, thereby invalidating the linear model obtained at the other side of the peak. Similarly, at other operating points, the multivariable plant may exhibit new dynamic features such as inverse response. This work proposes to use deep reinforcement learning (DRL) strategies to simultaneously tune multiple SISO PID controllers using a single DRL agent while enforcing interval constraints on the tuning parameter values. This ensures that interaction effects between the loops are directly factored in the tuning. Interval constraints also ensure safety of the plant during training by ensuring that the tuning parameter values are bounded in a stable region. Moreover, a trained agent when deployed, provides operating condition based PID parameters on the fly ensuring nonlinear compensation in the PID design. The methodology is demonstrated on a quadruple tank benchmark system via simulations by simultaneously tuning two PI level controllers. The same methodology is then adopted to tune PI controllers for the operating condition under which the plant exhibits a right half plane multivariable direction zero. Comparisons with PI controllers tuned with standard methods suggest that the proposed method is a viable approach, particularly when simulators are available for the plant dynamics.</p></div>","PeriodicalId":72815,"journal":{"name":"Digital Chemical Engineering","volume":"9 ","pages":"Article 100131"},"PeriodicalIF":3.0000,"publicationDate":"2023-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2772508123000492/pdfft?md5=955833049b05399f7499873d259f2bbe&pid=1-s2.0-S2772508123000492-main.pdf","citationCount":"0","resultStr":"{\"title\":\"Simultaneous tuning of multiple PID controllers for multivariable systems using deep reinforcement learning\",\"authors\":\"Sammyak Mate,&nbsp;Pawankumar Pal,&nbsp;Anshumali Jaiswal,&nbsp;Sharad Bhartiya\",\"doi\":\"10.1016/j.dche.2023.100131\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Traditionally, tuning of PID controllers is based on linear approximation of the dynamics between the manipulated input and the controlled output. The tuning is performed one loop at a time and interaction effects between the multiple single-input-single-output (SISO) feedback control loops is ignored. It is also well-known that if the plant operates over a wide operating range, the dynamic behaviour changes thereby rendering the performance of an initially tuned PID controller unacceptable. 
The design of PID controllers, in general, is based on linear models that are obtained by linearizing a nonlinear system around a steady state operating point. For example, in peak seeking control, the sign of the process gain changes around the peak value, thereby invalidating the linear model obtained at the other side of the peak. Similarly, at other operating points, the multivariable plant may exhibit new dynamic features such as inverse response. This work proposes to use deep reinforcement learning (DRL) strategies to simultaneously tune multiple SISO PID controllers using a single DRL agent while enforcing interval constraints on the tuning parameter values. This ensures that interaction effects between the loops are directly factored in the tuning. Interval constraints also ensure safety of the plant during training by ensuring that the tuning parameter values are bounded in a stable region. Moreover, a trained agent when deployed, provides operating condition based PID parameters on the fly ensuring nonlinear compensation in the PID design. The methodology is demonstrated on a quadruple tank benchmark system via simulations by simultaneously tuning two PI level controllers. The same methodology is then adopted to tune PI controllers for the operating condition under which the plant exhibits a right half plane multivariable direction zero. Comparisons with PI controllers tuned with standard methods suggest that the proposed method is a viable approach, particularly when simulators are available for the plant dynamics.</p></div>\",\"PeriodicalId\":72815,\"journal\":{\"name\":\"Digital Chemical Engineering\",\"volume\":\"9 \",\"pages\":\"Article 100131\"},\"PeriodicalIF\":3.0000,\"publicationDate\":\"2023-10-20\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S2772508123000492/pdfft?md5=955833049b05399f7499873d259f2bbe&pid=1-s2.0-S2772508123000492-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Digital Chemical Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2772508123000492\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"ENGINEERING, CHEMICAL\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Digital Chemical Engineering","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2772508123000492","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, CHEMICAL","Score":null,"Total":0}
Citations: 0

Abstract

Traditionally, tuning of PID controllers is based on a linear approximation of the dynamics between the manipulated input and the controlled output. The tuning is performed one loop at a time, and interaction effects between the multiple single-input-single-output (SISO) feedback control loops are ignored. It is also well known that if the plant operates over a wide operating range, the dynamic behaviour changes, rendering the performance of an initially tuned PID controller unacceptable. The design of PID controllers, in general, is based on linear models obtained by linearizing a nonlinear system around a steady-state operating point. For example, in peak-seeking control, the sign of the process gain changes around the peak value, thereby invalidating the linear model obtained on the other side of the peak. Similarly, at other operating points, the multivariable plant may exhibit new dynamic features such as inverse response. This work proposes deep reinforcement learning (DRL) strategies to simultaneously tune multiple SISO PID controllers using a single DRL agent while enforcing interval constraints on the tuning parameter values. This ensures that interaction effects between the loops are directly factored into the tuning. The interval constraints also ensure safety of the plant during training by keeping the tuning parameter values bounded within a stable region. Moreover, a trained agent, when deployed, provides operating-condition-based PID parameters on the fly, ensuring nonlinear compensation in the PID design. The methodology is demonstrated via simulations on a quadruple-tank benchmark system by simultaneously tuning two PI level controllers. The same methodology is then adopted to tune the PI controllers for an operating condition under which the plant exhibits a right-half-plane multivariable direction zero. Comparisons with PI controllers tuned by standard methods suggest that the proposed method is a viable approach, particularly when simulators of the plant dynamics are available.
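The abstract combines several moving parts: a nonlinear quadruple-tank simulator, two PI level loops acting simultaneously, and an action space whose interval constraints keep every candidate tuning inside a safe box. The sketch below illustrates how these pieces could fit together. The tank parameters follow Johansson's well-known benchmark (minimum-phase configuration), while the parameter intervals, setpoints, ideal-PI form, and IAE-based episode cost are illustrative assumptions, not the paper's exact choices; the paper's DRL training algorithm itself is not reproduced here. An agent would emit an action in [-1, 1]^4, receive the negative of episode_cost(...) as its reward, and condition on the operating point so that a trained policy yields scheduled PID parameters on the fly.

```python
import numpy as np

# Quadruple-tank model parameters (Johansson's benchmark, minimum-phase
# configuration). Values are illustrative and may differ from the paper's.
A = np.array([28.0, 32.0, 28.0, 32.0])      # tank cross-sections (cm^2)
a = np.array([0.071, 0.057, 0.071, 0.057])  # outlet-hole areas (cm^2)
g = 981.0                                   # gravity (cm/s^2)
k1, k2 = 3.33, 3.35                         # pump gains (cm^3/(V s))
gamma1, gamma2 = 0.70, 0.60                 # valve flow splits

def tank_derivs(h, v1, v2):
    """Right-hand side of the nonlinear quadruple-tank ODEs."""
    h = np.maximum(h, 0.0)                  # levels cannot go negative
    q = a * np.sqrt(2.0 * g * h)            # Torricelli outlet flows
    return np.array([
        (-q[0] + q[2] + gamma1 * k1 * v1) / A[0],
        (-q[1] + q[3] + gamma2 * k2 * v2) / A[1],
        (-q[2] + (1.0 - gamma2) * k2 * v2) / A[2],
        (-q[3] + (1.0 - gamma1) * k1 * v1) / A[3],
    ])

# Interval constraints on the four tuning parameters [Kc1, tauI1, Kc2, tauI2].
# The bounds stand in for a pre-screened "stable region" and are assumed
# values for illustration, not those used in the paper.
P_LO = np.array([0.5, 10.0, 0.5, 10.0])
P_HI = np.array([10.0, 60.0, 10.0, 60.0])

def action_to_params(u):
    """Map a DRL action u in [-1, 1]^4 affinely onto the parameter box.
    This is one simple way to enforce interval constraints: every tuning
    the agent can express lies inside [P_LO, P_HI]."""
    u = np.clip(np.asarray(u, dtype=float), -1.0, 1.0)
    return P_LO + 0.5 * (u + 1.0) * (P_HI - P_LO)

def episode_cost(params, sp=(13.4, 13.7), h0=(12.4, 12.7, 1.8, 1.4),
                 t_end=600.0, dt=0.1):
    """Run both PI level loops simultaneously on the nonlinear plant and
    return the total IAE for a setpoint step; a DRL agent would receive
    the negative of this as its episode reward. Simplified ideal PI form,
    no anti-windup or derivative action."""
    kc1, ti1, kc2, ti2 = params
    h = np.array(h0, dtype=float)
    i1 = i2 = 0.0                           # integral-of-error states
    iae = 0.0
    for _ in range(int(t_end / dt)):
        e1, e2 = sp[0] - h[0], sp[1] - h[1]
        i1 += e1 * dt
        i2 += e2 * dt
        v1 = float(np.clip(kc1 * (e1 + i1 / ti1), 0.0, 10.0))  # pump volts
        v2 = float(np.clip(kc2 * (e2 + i2 / ti2), 0.0, 10.0))
        h += tank_derivs(h, v1, v2) * dt    # explicit Euler step
        iae += (abs(e1) + abs(e2)) * dt     # both loops enter the cost
    return iae

# One roll-out: candidate action -> bounded PI parameters -> episode cost.
params = action_to_params([0.2, -0.5, 0.1, -0.4])
print("PI parameters [Kc1, tauI1, Kc2, tauI2]:", params)
print("Episode IAE:", episode_cost(params))
```

Because the cost is accumulated while both loops run simultaneously on the nonlinear model, loop interactions enter the reward directly, which mirrors the abstract's point about factoring interaction effects into the tuning. Appending the operating point (e.g. the current setpoints) to the agent's observation would then let a trained policy emit operating-condition-based parameters on the fly.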
