Adaptive Optimal Tracking Control of an Underactuated Surface Vessel Using Actor–Critic Reinforcement Learning

IF 8.9 · Zone 1: Computer Science · Q1: Computer Science, Artificial Intelligence

IEEE Transactions on Neural Networks and Learning Systems · Pub Date: 2022-11-30 · DOI: 10.1109/TNNLS.2022.3214681

Lin Chen; Shi-Lu Dai; Chao Dong

Vol. 35, No. 6, pp. 7520-7533 · https://ieeexplore.ieee.org/document/9967790/

Citations: 0

Abstract

In this article, we present an adaptive reinforcement learning optimal tracking control (RLOTC) algorithm for an underactuated surface vessel subject to modeling uncertainties and time-varying external disturbances. By integrating the backstepping technique with optimized control design, we show that the desired optimal tracking performance of vessel control is guaranteed because the virtual and actual control inputs are designed as optimized solutions of every subsystem. To enhance the robustness of vessel control systems, we employ neural network (NN) approximators to approximate the uncertain vessel dynamics and use an adaptive control technique to estimate the upper bounds of the external disturbances. Under the reinforcement learning framework, we construct actor–critic networks to solve the Hamilton–Jacobi–Bellman equations corresponding to the subsystems of the surface vessel and thereby achieve optimized control. The optimized control algorithm can synchronously train the adaptive parameters not only for the actor–critic networks but also for the NN approximators and the adaptive control. Using the Lyapunov stability theorem, we show that the RLOTC algorithm can ensure the semiglobal uniform ultimate boundedness of the closed-loop systems. Compared with existing reinforcement learning control results, the presented RLOTC algorithm can compensate for uncertain vessel dynamics and unknown disturbances, and it obtains optimized control performance by considering optimization in every backstepping design step. Simulation studies on an underactuated surface vessel are given to illustrate the effectiveness of the RLOTC algorithm.
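To make the role of the Hamilton–Jacobi–Bellman (HJB) equation concrete, here is a minimal sketch on a toy problem. It is not the vessel model or the RLOTC algorithm from the paper. It assumes a scalar subsystem x' = a*x + u with running cost x^2 + u^2, for which the HJB equation has the closed-form solution J*(x) = p*x^2 and the optimal control u*(x) = -(1/2)*dJ*/dx = -p*x. All function names and parameter values are hypothetical and chosen for illustration only. For nonlinear, uncertain dynamics no such closed form is generally available, which is why the abstract resorts to actor–critic networks to approximate the solution.

```python
import math


def hjb_gain(a: float) -> float:
    """Positive root p of p**2 - 2*a*p - 1 = 0, so that J*(x) = p * x**2.

    Toy assumption: scalar dynamics x' = a*x + u with cost integrand x**2 + u**2.
    """
    return a + math.sqrt(a * a + 1.0)


def optimal_control(a: float, x: float) -> float:
    """u*(x) = -(1/2) * dJ*/dx = -p * x for the toy problem."""
    return -hjb_gain(a) * x


def hjb_residual(a: float, x: float) -> float:
    """Evaluate x**2 + u**2 + (dJ*/dx) * (a*x + u) at u = u*(x).

    The HJB equation requires this quantity to be zero for every x.
    """
    p = hjb_gain(a)
    u = optimal_control(a, x)
    return x * x + u * u + 2.0 * p * x * (a * x + u)


def simulate(a: float, x0: float, dt: float = 1e-3, steps: int = 5000) -> float:
    """Forward-Euler rollout of x' = a*x + u*(x); returns the final state."""
    x = x0
    for _ in range(steps):
        x += dt * (a * x + optimal_control(a, x))
    return x


if __name__ == "__main__":
    a = 0.5  # hypothetical open-loop unstable coefficient
    print("HJB residual at x = 2:", hjb_residual(a, 2.0))
    print("state after 5 s from x0 = 1:", simulate(a, 1.0))
```

In this toy case the closed loop becomes x' = (a - p)*x = -sqrt(a^2 + 1)*x, so the state decays even when the open-loop system (a > 0) is unstable. In a backstepping design, a computation of this kind would be repeated per subsystem, with learned approximations standing in for the closed-form value function.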


Source journal

IEEE Transactions on Neural Networks and Learning Systems · Computer Science, Artificial Intelligence; Computer Science, Hardware & Architecture

CiteScore: 23.80

Self-citation rate: 9.60%

Articles published: 2102

Review time: 3-8 weeks

Journal description: IEEE Transactions on Neural Networks and Learning Systems publishes scholarly articles on the theory, design, and applications of neural networks and other learning systems. The journal primarily highlights technical and scientific research in this domain.