Poincaré-Map-Based Reinforcement Learning for Biped Walking

Proceedings of the 2005 IEEE International Conference on Robotics and Automation. Pub Date: 2005-04-18. DOI: 10.1109/ROBOT.2005.1570469

J. Morimoto, J. Nakanishi, G. Endo, G. Cheng, C. Atkeson, G. Zeglin


Citations: 58

Abstract

We propose a model-based reinforcement learning algorithm for biped walking in which the robot learns to appropriately modulate an observed walking pattern. Via-points are detected from the observed walking trajectories using the minimum jerk criterion. The learning algorithm modulates the via-points as control actions to improve walking trajectories. This decision is based on a learned model of the Poincaré map of the periodic walking pattern. The model maps from a state in the single support phase and the control actions to a state in the next single support phase. We applied this approach to both a simulated robot model and an actual biped robot. We show that successful walking policies are acquired.
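The two ingredients named in the abstract can be illustrated with a small sketch. It is not the authors' implementation. The abstract does not state the trajectory parameterization or the model class, so both are assumptions here: a rest-to-rest minimum-jerk polynomial between two via-points, and a scalar linear model x_next ≈ a·x + b·u fitted by least squares to stand in for the learned Poincaré map. The names `min_jerk` and `fit_linear_poincare_map` are hypothetical.

```python
def min_jerk(x0, x1, tau):
    """Rest-to-rest minimum-jerk interpolation between via-points x0 and x1.

    tau is normalized time in [0, 1]. Zero velocity and acceleration at both
    ends is an assumption of this sketch, not something the abstract states.
    """
    s = 10 * tau**3 - 15 * tau**4 + 6 * tau**5
    return x0 + (x1 - x0) * s


def fit_linear_poincare_map(states, actions, next_states):
    """Least-squares fit of x_next ~ a*x + b*u for scalar x and u.

    states[k] is the state sampled in one single support phase, actions[k]
    is the via-point modulation applied, and next_states[k] is the state
    sampled in the following single support phase. The linear model class
    is an assumption made for illustration.
    """
    sxx = sum(x * x for x in states)
    sxu = sum(x * u for x, u in zip(states, actions))
    suu = sum(u * u for u in actions)
    sxy = sum(x * y for x, y in zip(states, next_states))
    suy = sum(u * y for u, y in zip(actions, next_states))
    det = sxx * suu - sxu * sxu
    if abs(det) < 1e-12:
        raise ValueError("data do not vary state and action independently")
    # Solve the 2x2 normal equations by Cramer's rule.
    a = (sxy * suu - suy * sxu) / det
    b = (suy * sxx - sxy * sxu) / det
    return a, b


# Synthetic step-to-step data generated from a known map (a=0.8, b=0.5).
states = [1.0, 2.0, 3.0, -1.0]
actions = [0.5, -1.0, 0.2, 1.0]
next_states = [0.8 * x + 0.5 * u for x, u in zip(states, actions)]
a_hat, b_hat = fit_linear_poincare_map(states, actions, next_states)
```

With a fitted map of this kind, a controller could pick the action u that drives the predicted next-step state toward a desired value. How the paper actually selects actions is not described in the abstract.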
