SYNLOCO‐VE: Synthesizing central pattern generator with reinforcement learning and velocity estimator for quadruped locomotion

Xinyu Zhang, Zhiyuan Xiao, Xiang Zhou, Qingrui Zhang
Optimal Control Applications and Methods. Published 2024-07-10. DOI: 10.1002/oca.3181 (https://doi.org/10.1002/oca.3181)

Abstract

Learning a robust and natural locomotion controller for quadruped robots across different terrains and velocities is a challenging task, and it becomes harder still when no exteroceptive sensors are available. This article therefore investigates learning‐based locomotion control for quadruped robots using only proprioceptive sensors. A new framework called SYNLOCO‐VE is proposed, which synthesizes a feedforward gait planner, a trunk velocity estimator, and reinforcement learning (RL). The feedforward gait planner is based on the well‐known central pattern generator, but it can change the foot length to improve velocity tracking performance. The trunk velocity estimator is based on deep learning and estimates the trunk velocity from historical proprioceptive sensor data. Introducing the trunk velocity estimator mitigates the partial observation issue caused by the lack of exteroceptive sensors. RL is employed to learn a feedback controller that regulates the robot gaits using feedback from the proprioceptive sensors and the trunk velocity estimate. In the proposed framework, the feedforward gait planner also guides the RL training process, resulting in more stable and faster policy learning. Ablation studies are provided to demonstrate the efficiency of the different modules in the proposed design. Extensive experiments are performed on the quadruped robot Go1, which has only proprioceptive sensors. The proposed framework is able to learn robust and stable locomotion across different terrains and tasks. Experimental comparisons are also conducted to illustrate the advantages of the proposed design over state‐of‐the‐art methods.
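
The abstract does not give the planner's equations, so the following is only a rough sketch of how a feedforward gait planner and a learned feedback term might be combined. It is not the authors' implementation. Everything in it is an assumption made for illustration: the trot phase offsets, the leg order, the sawtooth phase oscillator, the half-sine swing profile, the reading of "foot length" as a per-cycle step length, and all function names. The velocity estimator and the RL policy are represented only by a residual that is added to the feedforward targets.

```python
import math

# Assumed leg order: front-left, front-right, rear-left, rear-right.
# Assumed trot gait: diagonal leg pairs share the same phase.
TROT_OFFSETS = (0.0, 0.5, 0.5, 0.0)


def step_length_for_velocity(v_cmd, frequency=2.0, duty=0.5):
    """Hypothetical rule for adapting step length to a commanded velocity.

    During stance (duration duty / frequency) the foot sweeps one step
    length relative to the trunk, so v = step_length * frequency / duty.
    """
    return v_cmd * duty / frequency


def cpg_foot_targets(t, frequency=2.0, step_length=0.1,
                     step_height=0.05, duty=0.5):
    """Feedforward (x, z) foot offsets for each leg at time t (sketch)."""
    targets = []
    for offset in TROT_OFFSETS:
        phase = (frequency * t + offset) % 1.0  # sawtooth phase in [0, 1)
        if phase < duty:
            # Stance: foot on the ground, sliding backward linearly.
            s = phase / duty
            x = step_length * (0.5 - s)
            z = 0.0
        else:
            # Swing: foot moves forward and lifts along a half-sine.
            s = (phase - duty) / (1.0 - duty)
            x = step_length * (s - 0.5)
            z = step_height * math.sin(math.pi * s)
        targets.append((x, z))
    return targets


def combine(feedforward, residual):
    """Add a feedback correction (e.g. a policy output) to the feedforward targets."""
    return [(x + dx, z + dz) for (x, z), (dx, dz) in zip(feedforward, residual)]


if __name__ == "__main__":
    length = step_length_for_velocity(0.4)      # commanded 0.4 m/s
    ff = cpg_foot_targets(0.375, step_length=length)
    zero_residual = [(0.0, 0.0)] * 4            # stand-in for the learned policy
    for leg, target in zip(("FL", "FR", "RL", "RR"), combine(ff, zero_residual)):
        print(leg, target)
```

In this sketch the feedforward pattern alone already yields a periodic gait, which is one plausible reading of how the planner could guide early RL training: the policy only needs to learn corrections around a reasonable baseline.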