Policy Gradient-based Integral Reinforcement Learning for Optimal Control Design of Nonaffine Morphing Aircraft Systems
Hanna Lee, Seong-hun Kim, Youdan Kim
2020 28th Mediterranean Conference on Control and Automation (MED), 2020-09-01
DOI: https://doi.org/10.1109/MED48518.2020.9183024
An online, model-free optimal control design strategy is proposed for general continuous-time nonlinear nonaffine systems using policy gradient-based integral reinforcement learning. For a nonaffine system, such as a morphing-wing aircraft whose morphing parameters are treated as control effectors, general nonlinear control design methods cannot be applied, and solving the Hamilton-Jacobi-Bellman equation analytically is difficult. The proposed online optimal control algorithm is built on an actor-critic structure that uses a Q-function and a policy gradient scheme, and the integral reinforcement learning approach is used to develop the actor-critic parameter estimation for the continuous-time system. A closed-loop stability analysis of the designed method is presented. With the proposed method, an optimal controller can be designed for a general nonaffine system, which is computationally advantageous for complex systems, and the entire dynamic model is not required. Simulation results demonstrate the effectiveness of the proposed scheme.
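The integral reinforcement learning idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' policy gradient and Q-function algorithm. It shows only the standard integral Bellman relation for evaluating a fixed policy, V(x(t)) - V(x(t+T)) = integral of the running cost over [t, t+T], fitted by least squares. The scalar nonaffine dynamics, the feedback policy, the cost, the polynomial critic features, and all function names are hypothetical choices made for illustration. The dynamics function is used only inside the simulator to generate data; the critic fit itself sees only sampled states and accumulated costs.

```python
import numpy as np

def simulate_interval(f, policy, cost, x0, T, dt=1e-3):
    """Integrate x' = f(x, u) over one interval of length T (forward Euler).

    Returns the state at the end of the interval and the accumulated running cost.
    """
    x, integral = float(x0), 0.0
    for _ in range(int(round(T / dt))):
        u = policy(x)
        integral += cost(x, u) * dt
        x += f(x, u) * dt
    return x, integral

def fit_critic(f, policy, cost, x_starts, T, features):
    """Least-squares fit of a critic V(x) = w . phi(x) for a fixed policy.

    Each sampled interval contributes one equation of the integral Bellman relation:
        w . (phi(x(t)) - phi(x(t+T))) = integral of the running cost over [t, t+T]
    """
    rows, targets = [], []
    for x0 in x_starts:
        x_end, integral = simulate_interval(f, policy, cost, x0, T)
        rows.append(features(x0) - features(x_end))
        targets.append(integral)
    w, *_ = np.linalg.lstsq(np.array(rows), np.array(targets), rcond=None)
    return w

# Hypothetical toy problem (an assumption, not taken from the paper):
# the input u enters the dynamics nonlinearly, so the system is nonaffine in u.
f_demo = lambda x, u: -x + np.tanh(u) + 0.1 * u**3
policy_demo = lambda x: -x                      # fixed feedback policy to evaluate
cost_demo = lambda x, u: x**2 + u**2            # quadratic running cost
features_demo = lambda x: np.array([x**2, x**4])

x_starts = np.linspace(-1.0, 1.0, 9)
x_starts = x_starts[np.abs(x_starts) > 1e-9]    # skip the equilibrium, it carries no information
w_demo = fit_critic(f_demo, policy_demo, cost_demo, x_starts, T=0.5, features=features_demo)
print("critic weights:", w_demo)
```

For a linear closed loop the fit can be checked by hand: with x' = -x + u, u = -x, and cost x^2 + u^2, the exact value function is V(x) = 0.5 x^2, and the single-feature fit returns a weight close to 0.5 (up to Euler discretization error).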