Optimized tracking control using reinforcement learning and backstepping technique for canonical nonlinear unknown dynamic system

Optimal Control Applications and Methods. Pub Date: 2024-02-25. DOI: 10.1002/oca.3115

Yanfen Song, Zijun Li, Guoxing Wen

Abstract

This work addresses optimized tracking control for canonical nonlinear systems with unknown dynamics by combining reinforcement learning (RL) with the backstepping technique. Because such a system contains multiple state variables linked by differential relations, backstepping is used to construct a sequence of virtual controls in accordance with Lyapunov functions. In the final backstepping step, the optimized actual control is derived by performing RL under an identifier-critic-actor structure, where RL is used to overcome the difficulty of solving the Hamilton-Jacobi-Bellman (HJB) equation. Unlike traditional RL-based optimization methods, which derive the RL updating laws from the square of the approximated HJB equation, the proposed method derives the RL training laws from the negative gradient of a simple positive definite function that is equivalent to the HJB equation. As a result, the proposed control markedly reduces algorithmic complexity and also removes the requirement that the system dynamics be known. Finally, theoretical analysis and simulation demonstrate the feasibility of the proposed control.
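The backstepping part of the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a second-order canonical system x1' = x2, x2' = f(x) + u whose nonlinearity f is known, and it omits the identifier-critic-actor RL stage entirely. The function f, the reference sin(t), the gains c1 and c2, and the Euler step size are all hypothetical choices made for illustration. With the Lyapunov function V = (z1^2 + z2^2)/2, the control below gives V' = -c1 z1^2 - c2 z2^2, so the tracking error z1 decays.

```python
import math


def f(x1, x2):
    # Hypothetical nonlinearity, assumed known here (the paper treats it as unknown).
    return -x1 - 0.5 * x2 + math.sin(x1)


def simulate(c1=2.0, c2=2.0, dt=1e-3, t_end=10.0):
    """Track y_r(t) = sin(t) with a two-step backstepping controller.

    Returns the list of tracking errors z1 = x1 - y_r at each Euler step.
    """
    x1, x2 = 1.0, 0.0  # assumed initial state
    steps = int(round(t_end / dt))
    errors = []
    for k in range(steps):
        t = k * dt
        # Reference signal and its first two derivatives.
        yr, yr_d, yr_dd = math.sin(t), math.cos(t), -math.sin(t)

        # Step 1: tracking error and virtual control for the x1-subsystem.
        z1 = x1 - yr
        alpha1 = -c1 * z1 + yr_d

        # Step 2: error between x2 and the virtual control, then actual control.
        z2 = x2 - alpha1
        alpha1_d = -c1 * (x2 - yr_d) + yr_dd  # time derivative of alpha1
        u = -z1 - c2 * z2 - f(x1, x2) + alpha1_d

        errors.append(z1)
        # Forward-Euler integration of the plant (right side uses the old state).
        x1, x2 = x1 + dt * x2, x2 + dt * (f(x1, x2) + u)
    return errors


if __name__ == "__main__":
    errs = simulate()
    print("initial error:", errs[0], "final error:", errs[-1])
```

In the setting the abstract describes, the term that cancels f would come from a learned identifier, and the final control would be produced by the actor rather than written in closed form; this sketch only shows how the virtual control sequence is assembled.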
