Simultaneous tuning of multiple PID controllers for multivariable systems using deep reinforcement learning

Impact factor: 3.0; JCR quartile: Q2 (Engineering, Chemical)

Digital Chemical Engineering, Volume 9, Article 100131. Pub Date: 2023-10-20. DOI: 10.1016/j.dche.2023.100131

Sammyak Mate, Pawankumar Pal, Anshumali Jaiswal, Sharad Bhartiya



Abstract

Traditionally, PID controllers are tuned using a linear approximation of the dynamics between the manipulated input and the controlled output. Tuning is performed one loop at a time, and interaction effects between the multiple single-input-single-output (SISO) feedback control loops are ignored. It is also well known that if the plant operates over a wide operating range, its dynamic behaviour changes, rendering the performance of an initially tuned PID controller unacceptable. PID controller design is generally based on linear models obtained by linearizing a nonlinear system around a steady-state operating point. For example, in peak-seeking control, the sign of the process gain changes around the peak value, invalidating a linear model obtained on the other side of the peak. Similarly, at other operating points, the multivariable plant may exhibit new dynamic features such as inverse response. This work proposes using deep reinforcement learning (DRL) to tune multiple SISO PID controllers simultaneously with a single DRL agent while enforcing interval constraints on the tuning parameter values. Tuning all loops together means that interaction effects between the loops are factored directly into the tuning. The interval constraints also keep the plant safe during training by bounding the tuning parameter values within a stable region. Moreover, a trained agent, when deployed, provides PID parameters based on the operating condition on the fly, giving nonlinear compensation in the PID design. The methodology is demonstrated in simulation on a quadruple-tank benchmark system by simultaneously tuning two PI level controllers. The same methodology is then used to tune PI controllers for the operating condition under which the plant exhibits a right-half-plane multivariable direction zero. Comparisons with PI controllers tuned by standard methods suggest that the proposed method is a viable approach, particularly when simulators are available for the plant dynamics.
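
The abstract describes two ingredients that a small sketch can make concrete: interval constraints on the tuning parameters, and a simulated quadruple-tank plant with two PI level loops whose closed-loop performance the agent is scored on. The Python below is a minimal illustration under stated assumptions, not the authors' implementation. The function names (`scale_action`, `simulate_two_pi_loops`), the tank and pump values, the parameter intervals, and the integral-squared-error cost are all hypothetical choices made for this example. The DRL agent itself is omitted; a fixed action vector stands in for its output.

```python
import math


def scale_action(action, bounds):
    """Map raw agent outputs in [-1, 1] onto per-parameter [low, high] intervals."""
    params = []
    for a, (low, high) in zip(action, bounds):
        a = max(-1.0, min(1.0, a))  # clip first, so the interval can never be violated
        params.append(low + 0.5 * (a + 1.0) * (high - low))
    return params


def simulate_two_pi_loops(params, setpoints, steps=4000, dt=1.0):
    """Euler simulation of a quadruple-tank model under two PI level controllers.

    params = [Kc1, Ti1, Kc2, Ti2]; returns (final levels, integral squared error).
    All numerical values below are illustrative assumptions, not the paper's.
    """
    kc1, ti1, kc2, ti2 = params
    area = [28.0, 32.0, 28.0, 32.0]        # tank cross-sections, cm^2
    outlet = [0.071, 0.057, 0.071, 0.057]  # outlet hole areas, cm^2
    k1, k2 = 3.33, 3.35                    # pump gains, cm^3/(V s)
    g1, g2 = 0.70, 0.60                    # valve split fractions
    grav = 981.0                           # gravitational acceleration, cm/s^2
    h = [12.4, 12.7, 1.8, 1.4]             # initial levels, cm
    bias = 3.0                             # nominal pump voltage, V
    integ1 = integ2 = 0.0
    ise = 0.0
    for _ in range(steps):
        e1 = setpoints[0] - h[0]
        e2 = setpoints[1] - h[1]
        integ1 += e1 * dt
        integ2 += e2 * dt
        # PI law, with the pump voltage limited to 0..10 V
        v1 = max(0.0, min(10.0, bias + kc1 * (e1 + integ1 / ti1)))
        v2 = max(0.0, min(10.0, bias + kc2 * (e2 + integ2 / ti2)))
        # Outflow of each tank through its outlet hole
        q = [outlet[i] * math.sqrt(2.0 * grav * max(h[i], 0.0)) for i in range(4)]
        # Pump 1 feeds tanks 1 and 4; pump 2 feeds tanks 2 and 3;
        # tank 3 drains into tank 1 and tank 4 drains into tank 2.
        dh = [
            (-q[0] + q[2] + g1 * k1 * v1) / area[0],
            (-q[1] + q[3] + g2 * k2 * v2) / area[1],
            (-q[2] + (1.0 - g2) * k2 * v2) / area[2],
            (-q[3] + (1.0 - g1) * k1 * v1) / area[3],
        ]
        h = [max(0.0, h[i] + dt * dh[i]) for i in range(4)]
        ise += (e1 * e1 + e2 * e2) * dt
    return h, ise


# Hypothetical intervals for (Kc1, Ti1, Kc2, Ti2) and a stand-in agent action.
bounds = [(0.1, 2.0), (10.0, 100.0), (0.1, 2.0), (10.0, 100.0)]
params = scale_action([0.0, -0.5, 0.0, -0.5], bounds)
levels, cost = simulate_two_pi_loops(params, setpoints=(13.0, 13.0))
reward = -cost  # one possible reward signal for the agent
```

Because both loops are simulated together and scored with a single cost, any tuning the agent proposes is evaluated with the loop interactions included, which is the point the abstract makes about simultaneous tuning. Clipping before scaling is one simple way to guarantee the bounds; the paper may enforce its interval constraints differently.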

