Adaptive Actor-Critic Based Optimal Regulation for Drift-Free Nonlinear Systems

IEEE Open Journal of Control Systems | Pub Date: 2025-03-18 | DOI: 10.1109/OJCSYS.2025.3552999

Ashwin P. Dani; Shubhendu Bhasin

Volume 4, pages 117-129. Publication type: Journal Article. Full text: https://ieeexplore.ieee.org/document/10932715/


Abstract

In this paper, a continuous-time adaptive actor-critic reinforcement learning (RL) controller is developed for drift-free uncertain nonlinear systems. Practical examples of such systems are image-based visual servoing (IBVS) and wheeled mobile robots (WMR), where the system dynamics include a parametric uncertainty in the control effectiveness matrix with no drift term. The uncertainty in the input term poses a challenge when developing a continuous-time RL controller using existing methods. This paper presents an actor-critic/synchronous policy iteration (PI)-based RL controller with a newly derived constrained concurrent learning (CCL)-based parameter update law for estimating the unknown parameters of the linearly parametrized control effectiveness matrix. The parameter update law ensures that the parameter estimates do not converge to zero, avoiding possible loss of stabilization. An infinite-horizon value function minimization objective is achieved by regulating the current states to the desired states with near-optimal control effort. The proposed controller guarantees closed-loop stability, and simulation results in the presence of noise validate the proposed theory using IBVS and WMR examples.
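To make the system class concrete, the sketch below shows one common drift-free model, a unicycle-type wheeled mobile robot, written as x_dot = g(x, theta) u with no term that moves the state when the input is zero. The model, the function names, and the two gains in `theta` are assumptions chosen for illustration; they are not taken from the paper, and no part of the paper's controller or update law is reproduced here.

```python
import math


def unicycle_rates(state, u, theta=(1.0, 1.0)):
    """Return x_dot = g(x, theta) * u for a unicycle; there is no drift term f(x).

    state: (x, y, heading); u: (v, omega).
    theta: hypothetical unknown gains (k_v, k_w) that scale each input channel,
           so the control effectiveness matrix is linear in theta.
    """
    _, _, heading = state
    v, omega = u
    k_v, k_w = theta
    return (
        k_v * math.cos(heading) * v,
        k_v * math.sin(heading) * v,
        k_w * omega,
    )


def euler_step(state, u, dt, theta=(1.0, 1.0)):
    """Advance the state by one forward-Euler step of length dt."""
    rates = unicycle_rates(state, u, theta)
    return tuple(s + dt * r for s, r in zip(state, rates))
```

With zero input the state derivative is zero for any heading, which is what "drift-free" means. If the gains in `theta` were all zero, the input would have no effect on the state, which gives an intuition for why an update law that keeps the estimates away from zero matters for stabilization.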

