Learning-based locomotion control fusing multimodal perception for a bipedal humanoid robot

Chao Ji, Diyuan Liu, Wei Gao, Shiwu Zhang
Biomimetic Intelligence and Robotics, Volume 5, Issue 1, Article 100213. Published 2025-01-18.
DOI: 10.1016/j.birob.2025.100213
https://www.sciencedirect.com/science/article/pii/S266737972500004X

Abstract

Adaptive walking on varied terrain remains a critical challenge for the practical use of bipedal humanoid robots and has drawn substantial attention from academic and industrial research communities in recent years. Traditional model-based locomotion control methods require complex modeling, especially on complex terrain, which makes locomotion stability difficult to ensure. Reinforcement learning offers an end-to-end solution for locomotion control in humanoid robots, but such policies typically rely solely on proprioceptive sensing, which often leads to more robot body collisions in practical applications. Excessive collisions can damage the biped robot hardware. More critically, without additional modalities such as vision, the robot is limited in its ability to perceive environmental context and adjust its gait trajectory promptly, which also hampers stability and robustness during tasks. This paper adds visual information to the locomotion control problem for humanoid robots and proposes a three-stage multi-objective constraint policy distillation optimization algorithm. Expert policies for different terrains are trained through reinforcement learning to meet gait-aesthetics requirements, and these expert policies are then distilled into a student policy. Experimental results demonstrate a significant reduction in collision rates when using a control policy that integrates multimodal perception, especially on challenging terrains such as stairs, thresholds, and mixed surfaces. This advancement supports the practical deployment of bipedal humanoid robots.
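
The distillation step mentioned in the abstract can be illustrated with a toy sketch. This is not the paper's three-stage multi-objective algorithm, whose details the abstract does not give. It only shows the general idea of regressing one student policy onto the actions of per-terrain experts. Everything here is an assumption made for illustration: the two hand-written experts stand in for policies trained with reinforcement learning, the scalar `height` stands in for a visual feature, and the student is a simple linear model.

```python
import random

# Hypothetical per-terrain experts. In the paper these would be policies
# trained with reinforcement learning; here they are hand-written stand-ins.
def expert_flat(proprio, height):
    return 0.5 * proprio

def expert_stairs(proprio, height):
    return 0.5 * proprio + 2.0 * height

EXPERTS = {"flat": expert_flat, "stairs": expert_stairs}

def sample_observation(rng, terrain):
    """Draw a toy observation: one proprioceptive value, one visual value."""
    proprio = rng.uniform(-1.0, 1.0)
    # Assumed normalized terrain height seen ahead; zero on flat ground.
    height = 0.0 if terrain == "flat" else rng.uniform(0.2, 1.0)
    return proprio, height

def student_action(weights, proprio, height):
    """Linear student policy over both modalities plus a bias term."""
    w_proprio, w_height, bias = weights
    return w_proprio * proprio + w_height * height + bias

def distill(steps=5000, lr=0.05, seed=0):
    """Fit the student to the experts' actions with per-sample SGD."""
    rng = random.Random(seed)
    terrains = sorted(EXPERTS)
    weights = [0.0, 0.0, 0.0]
    for _ in range(steps):
        terrain = rng.choice(terrains)
        proprio, height = sample_observation(rng, terrain)
        target = EXPERTS[terrain](proprio, height)
        error = student_action(weights, proprio, height) - target
        # Gradient of 0.5 * error**2 with respect to each weight.
        weights[0] -= lr * error * proprio
        weights[1] -= lr * error * height
        weights[2] -= lr * error
    return weights

def distillation_loss(weights, samples=1000, seed=1):
    """Mean squared gap between student and expert actions on fresh data."""
    rng = random.Random(seed)
    terrains = sorted(EXPERTS)
    total = 0.0
    for _ in range(samples):
        terrain = rng.choice(terrains)
        proprio, height = sample_observation(rng, terrain)
        target = EXPERTS[terrain](proprio, height)
        total += (student_action(weights, proprio, height) - target) ** 2
    return total / samples

if __name__ == "__main__":
    student = distill()
    print("student weights:", student)
    print("held-out distillation loss:", distillation_loss(student))
```

In this toy setting the student never receives a terrain label. It can still match both experts because the visual feature separates the two cases, which loosely mirrors the abstract's argument for adding vision to a proprioception-only policy.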