Reinforcement Learning-Based Parameter Optimization for Whole-Body Admittance Control with IS-MPC

Nícolas Figueroa, Julio Tafur, A. Kheddar
Published in: 2024 IEEE/SICE International Symposium on System Integration (SII), pp. 1405-1410
Publication date: 2024-01-08
DOI: 10.1109/SII58957.2024.10417367 (https://doi.org/10.1109/SII58957.2024.10417367)
Citations: 0

Abstract

Maintaining stability in bipedal walking remains a significant challenge in humanoid robotics, largely because of the many hyperparameters involved. Traditional methods for determining these hyperparameters, such as heuristic approaches, can be both time-consuming and potentially suboptimal. In this paper, we present an approach aimed at enhancing the stability of bipedal gait, particularly when faced with floor perturbations and speed variations. Our main contribution is the integration of intrinsically stable model predictive control (IS-MPC) and whole-body admittance control within a closed-loop reinforcement learning system. We devised a reinforcement learning plugin, implemented in the mc_rtc framework, that allows the control system to continuously monitor the robot's current states, maintain recursive feasibility, and optimize parameters in real time. Furthermore, we propose a reward function derived from a combination of changes in single and double support time, postural recovery, divergent control of motion, and action generation grounded in training optimization. In the course of this research, we conducted experiments on a real humanoid robot to validate initial aspects of our work. The integrated module's effectiveness was further assessed through comprehensive simulations.
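The abstract names the ingredients of the reward function but not its form. As a rough illustration only, the sketch below combines those ingredients as a negative weighted sum of costs. Everything in it is an assumption made for illustration: the name `composite_reward`, the `RewardWeights` fields and their default values, the choice of cost for each term, and the reading of the "divergent control of motion" term as a 2D tracking error. None of it is taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class RewardWeights:
    # Hypothetical weights; the abstract does not give any values.
    support_time: float = 1.0
    posture: float = 1.0
    divergent_motion: float = 1.0
    action: float = 0.1


def composite_reward(
    d_single_support: float,               # change in single support time [s]
    d_double_support: float,               # change in double support time [s]
    torso_tilt: float,                     # posture error [rad]
    divergent_error: Tuple[float, float],  # assumed 2D tracking error [m]
    action: Sequence[float],               # parameter updates chosen by the agent
    w: RewardWeights = RewardWeights(),
) -> float:
    """Negative weighted sum of costs: zero is best, more negative is worse."""
    timing_cost = abs(d_single_support) + abs(d_double_support)
    posture_cost = torso_tilt ** 2
    divergent_cost = math.hypot(divergent_error[0], divergent_error[1])
    action_cost = sum(a * a for a in action)  # discourages large parameter jumps
    return -(
        w.support_time * timing_cost
        + w.posture * posture_cost
        + w.divergent_motion * divergent_cost
        + w.action * action_cost
    )
```

In a setup like the one described, a function of this kind would be evaluated once per control step or per footstep, with the action being the controller parameters the agent adjusts. How the terms are actually weighted and scaled is not stated in the abstract.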