A Novel Actor–Critic Motor Reinforcement Learning for Continuum Soft Robots

Robotics | IF 2.9, Q2 (ROBOTICS) | Pub Date: 2023-10-09 | DOI: 10.3390/robotics12050141
Luis Pantoja-Garcia, Vicente Parra-Vega, Rodolfo Garcia-Rodriguez, Carlos Ernesto Vázquez-García
Citations: 1

Abstract

Reinforcement learning (RL) is explored for motor control of a novel pneumatic-driven soft robot modeled as a continuum medium with varying density. This model complies with closed-form Lagrangian dynamics and fulfills fundamental structural properties such as passivity. The question then arises of how to synthesize a passivity-based RL scheme that controls the unknown continuum soft-robot dynamics while exploiting its input–output energy properties through a reward-based neural network controller. We therefore propose a continuous-time Actor–Critic scheme for tracking tasks of a continuum 3D soft robot subject to Lipschitz disturbances. A reward-based temporal difference drives learning through a novel discontinuous adaptive mechanism for the Critic neural weights, while the reward and the integral of the Bellman error approximation reinforce the adaptive mechanism of the Actor neural weights. Closed-loop stability is guaranteed in the sense of Lyapunov, yielding local exponential convergence of tracking errors based on integral sliding modes. Notably, the dynamics are assumed unknown, yet the control remains continuous and robust. A representative simulation study shows the effectiveness of the proposal for tracking tasks.
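The continuous-time Actor–Critic idea summarized in the abstract can be sketched in a few lines. The following is an illustrative toy only, not the paper's actual adaptation laws: the Critic weights adapt from a reward-based temporal-difference (Bellman) error, the Actor weights are reinforced by the same signal, and the continuous-time updates are integrated by forward Euler. The RBF basis, all gains, the control saturation, and the first-order surrogate plant are assumptions made purely for illustration.

```python
import numpy as np

def rbf(x, centers, width=1.0):
    """Radial basis features approximating both value and policy."""
    return np.exp(-((x - centers) ** 2) / (2.0 * width ** 2))

centers = np.linspace(-2.0, 2.0, 7)
Wc = np.zeros(7)                 # Critic weights (value approximation)
Wa = np.zeros(7)                 # Actor weights (policy approximation)
gamma, ac, aa = 0.9, 1.0, 0.3    # discount and Critic/Actor learning rates
dt, x, xd = 0.01, 0.5, 0.0       # Euler step, state, desired state

for _ in range(2000):
    phi = rbf(x, centers)
    u = np.clip(Wa @ phi, -2.0, 2.0)      # Actor action, saturated like a real actuator
    e = x - xd
    r = -e ** 2 - 0.01 * u ** 2           # reward penalizes error and effort
    x_next = x + dt * (-x + u)            # toy stable first-order surrogate plant
    # Temporal-difference (Bellman) error on the approximated value function
    delta = r + gamma * (Wc @ rbf(x_next, centers)) - Wc @ phi
    Wc += dt * ac * delta * phi           # Critic adaptation driven by TD error
    Wa += dt * aa * delta * phi           # Actor reinforced by the same TD signal
    x = x_next

print(f"final |x - xd| = {abs(x - xd):.4f}")
```

The paper's scheme differs in essential ways this sketch cannot capture: the adaptation of the Critic weights is discontinuous, the Actor update uses the integral of the Bellman error approximation, and stability is established via Lyapunov analysis with integral sliding modes rather than observed empirically.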
Source journal: Robotics
Categories: Robotics; Mathematics-Control and Optimization
CiteScore: 6.70
Self-citation rate: 8.10%
Articles per year: 114
Review time: 11 weeks
Journal description: Robotics publishes original papers, technical reports, case studies, review papers and tutorials in all aspects of robotics. Special Issues devoted to important topics in advanced robotics will be published from time to time. It particularly welcomes emerging methodologies and techniques which bridge theoretical studies and applications and have significant potential for real-world applications. It provides a forum for information exchange between professionals, academicians and engineers working in the area of robotics, helping them to disseminate research findings and to learn from each other's work. Suitable topics include, but are not limited to:
- intelligent robotics, mechatronics, and biomimetics
- novel and biologically-inspired robotics
- modelling, identification and control of robotic systems
- biomedical, rehabilitation and surgical robotics
- exoskeletons, prosthetics and artificial organs
- AI, neural networks and fuzzy logic in robotics
- multimodality human-machine interaction
- wireless sensor networks for robot navigation
- multi-sensor data fusion and SLAM