{"title":"基于神经动力学模型的强化学习 MPC","authors":"Saket Adhau , Sébastien Gros , Sigurd Skogestad","doi":"10.1016/j.ejcon.2024.101048","DOIUrl":null,"url":null,"abstract":"<div><div>This paper presents an end-to-end learning approach to developing a Nonlinear Model Predictive Control (NMPC) policy, which does not require an explicit first-principles model and assumes that the system dynamics are either unknown or partially known. The paper proposes the use of available measurements to identify a nominal Recurrent Neural Network (RNN) model to capture the nonlinear dynamics, which includes constraints on the state variables and inputs. To address the issue of suboptimal control policies resulting from simply fitting the model to the data, this paper uses Reinforcement learning (RL) to tune the NMPC scheme and generate an optimal policy for the real system. The approach’s novelty lies in the use of RL to overcome the limitations of the nominal RNN model and generate a more accurate control policy. The paper discusses the implementation aspects of initial state estimation for RNN models and integration of neural models in MPC. 
The presented method is demonstrated on a classic benchmark control problem: cascaded two tank system (CTS).</div></div>","PeriodicalId":50489,"journal":{"name":"European Journal of Control","volume":"80 ","pages":"Article 101048"},"PeriodicalIF":2.5000,"publicationDate":"2024-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Reinforcement learning based MPC with neural dynamical models\",\"authors\":\"Saket Adhau , Sébastien Gros , Sigurd Skogestad\",\"doi\":\"10.1016/j.ejcon.2024.101048\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><div>This paper presents an end-to-end learning approach to developing a Nonlinear Model Predictive Control (NMPC) policy, which does not require an explicit first-principles model and assumes that the system dynamics are either unknown or partially known. The paper proposes the use of available measurements to identify a nominal Recurrent Neural Network (RNN) model to capture the nonlinear dynamics, which includes constraints on the state variables and inputs. To address the issue of suboptimal control policies resulting from simply fitting the model to the data, this paper uses Reinforcement learning (RL) to tune the NMPC scheme and generate an optimal policy for the real system. The approach’s novelty lies in the use of RL to overcome the limitations of the nominal RNN model and generate a more accurate control policy. The paper discusses the implementation aspects of initial state estimation for RNN models and integration of neural models in MPC. 
The presented method is demonstrated on a classic benchmark control problem: cascaded two tank system (CTS).</div></div>\",\"PeriodicalId\":50489,\"journal\":{\"name\":\"European Journal of Control\",\"volume\":\"80 \",\"pages\":\"Article 101048\"},\"PeriodicalIF\":2.5000,\"publicationDate\":\"2024-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"European Journal of Control\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S0947358024001080\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"AUTOMATION & CONTROL SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"European Journal of Control","FirstCategoryId":"94","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0947358024001080","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"AUTOMATION & CONTROL SYSTEMS","Score":null,"Total":0}
Reinforcement learning based MPC with neural dynamical models
This paper presents an end-to-end learning approach to developing a Nonlinear Model Predictive Control (NMPC) policy that does not require an explicit first-principles model and assumes that the system dynamics are either unknown or only partially known. The paper proposes using available measurements to identify a nominal Recurrent Neural Network (RNN) model that captures the nonlinear dynamics, subject to constraints on the state variables and inputs. To address the suboptimal control policies that result from simply fitting the model to the data, the paper uses reinforcement learning (RL) to tune the NMPC scheme and generate an optimal policy for the real system. The approach's novelty lies in the use of RL to overcome the limitations of the nominal RNN model and generate a more accurate control policy. The paper also discusses implementation aspects of initial state estimation for RNN models and the integration of neural models in MPC. The presented method is demonstrated on a classic benchmark control problem: the cascaded two-tank system (CTS).
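The core idea of the abstract (tune an MPC scheme built on an inaccurate nominal model against the real system, rather than trusting the fitted model) can be sketched in a few lines. This is a minimal illustration, not the paper's method: it replaces the RNN with a deliberately biased scalar linear model, the NMPC with a one-step analytic controller, and the RL update with finite-difference descent on a single cost weight `theta`; all names and numbers here are invented for illustration.

```python
# Sketch: tune an MPC cost weight against the TRUE plant, not the nominal model.
A_TRUE, B_TRUE = 0.9, 1.0   # "real" plant x+ = a x + b u (unknown to the controller)
A_NOM, B_NOM = 0.7, 0.8     # deliberately biased "identified" nominal model

def mpc_input(x, theta):
    # One-step MPC on the nominal model: argmin_u (a_nom x + b_nom u)^2 + theta u^2
    return -A_NOM * B_NOM * x / (B_NOM**2 + theta)

def closed_loop_cost(theta, x0=5.0, steps=30, r=0.1):
    # Roll the MPC policy out on the TRUE plant and accumulate the realized cost.
    x, cost = x0, 0.0
    for _ in range(steps):
        u = mpc_input(x, theta)
        cost += x**2 + r * u**2
        x = A_TRUE * x + B_TRUE * u
    return cost

def tune(theta=1.0, iters=50, lr=0.05, eps=1e-3):
    # Stand-in for the RL update: finite-difference descent on the closed-loop cost.
    for _ in range(iters):
        grad = (closed_loop_cost(theta + eps) - closed_loop_cost(theta - eps)) / (2 * eps)
        theta = max(1e-4, theta - lr * grad)
    return theta

theta_star = tune(1.0)
print(closed_loop_cost(1.0), closed_loop_cost(theta_star))
```

Because the tuning signal is the cost realized on the true plant, the weight compensates for the model bias instead of inheriting it, which is the same mechanism the paper exploits with RL on top of the RNN-based NMPC.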
Journal introduction:
The European Control Association (EUCA) has among its objectives to promote the development of the discipline. Apart from the European Control Conferences, the European Journal of Control is the Association's main channel for the dissemination of important contributions in the field.
The aim of the Journal is to publish high quality papers on the theory and practice of control and systems engineering.
The scope of the Journal will be wide and cover all aspects of the discipline including methodologies, techniques and applications.
Research in control and systems engineering is necessary to develop new concepts and tools that enhance our understanding and improve our ability to design and implement high-performance control systems. Submitted papers should stress the practical motivations and relevance of their results.
The design and implementation of a successful control system requires the use of a range of techniques:
Modelling
Robustness analysis
Identification
Optimization
Control law design
Numerical analysis
Fault detection, and so on.