Online Multicontact Receding Horizon Planning via Value Function Approximation

IF 10.5 | Zone 1, Computer Science | Q1, ROBOTICS

IEEE Transactions on Robotics Pub Date : 2024-04-22 DOI:10.1109/TRO.2024.3392154

Jiayi Wang;Sanghyun Kim;Teguh Santoso Lembono;Wenqian Du;Jaehyun Shim;Saeid Samadi;Ke Wang;Vladimir Ivan;Sylvain Calinon;Sethu Vijayakumar;Steve Tonneau

IEEE Transactions on Robotics, vol. 40, pp. 2791-2810. https://ieeexplore.ieee.org/document/10506550/

Citations: 0

Abstract

Planning multicontact motions in a receding horizon fashion requires a value function to guide the planning with respect to the future, e.g., building momentum to traverse large obstacles. Traditionally, the value function is approximated by computing trajectories in a prediction horizon (never executed) that foresees the future beyond the execution horizon. However, given the nonconvex dynamics of multicontact motions, this approach is computationally expensive. To enable online receding horizon planning (RHP) of multicontact motions, we find efficient approximations of the value function. Specifically, we propose a trajectory-based and a learning-based approach. In the former, namely RHP with multiple levels of model fidelity, we approximate the value function by computing the prediction horizon with a convex relaxed model. In the latter, namely locally guided RHP, we learn an oracle to predict local objectives for locomotion tasks, and we use these local objectives to construct local value functions for guiding a short-horizon RHP. We evaluate both approaches in simulation by planning centroidal trajectories of a humanoid robot walking on moderate slopes, and on large slopes where the robot cannot maintain static balance. Our results show that locally guided RHP achieves the best computational efficiency (95%–98.6% of cycles converge online). This computational advantage enables us to demonstrate online RHP of our real-world humanoid robot Talos walking in dynamic environments that change on-the-fly.
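The role a value function plays in short-horizon planning can be illustrated on a toy problem. The sketch below is not the authors' method: it assumes a scalar linear system with quadratic costs (the constants `A`, `B`, `Q`, `R` and all function names are hypothetical) and obtains the value function by plain value iteration rather than from a convex relaxed model or a learned oracle. It only shows that a one-step planner with a terminal value term steers the state to the origin, while the same planner without that term never moves.

```python
A, B = 1.0, 1.0   # toy scalar dynamics x_next = A*x + B*u (assumed)
Q, R = 1.0, 1.0   # state and control cost weights (assumed)


def fit_value_weight(iterations=100):
    """Value iteration for V(x) = p * x**2 on the toy linear-quadratic problem."""
    p = 0.0
    for _ in range(iterations):
        # Scalar Riccati recursion: one Bellman backup of the quadratic value.
        p = Q + A * A * p - (A * B * p) ** 2 / (R + B * B * p)
    return p


def plan_one_step(x, p):
    """Minimise R*u**2 + p*(A*x + B*u)**2 over u (closed-form solution)."""
    return -(A * B * p) / (R + B * B * p) * x


def receding_horizon(x, p, cycles=20):
    """Replan at every cycle and execute only the control just planned."""
    trajectory = [x]
    for _ in range(cycles):
        u = plan_one_step(x, p)
        x = A * x + B * u
        trajectory.append(x)
    return trajectory


p_star = fit_value_weight()
guided = receding_horizon(1.0, p_star)  # terminal value term guides the planner
myopic = receding_horizon(1.0, 0.0)     # no terminal value term: state never moves
```

Here `guided` contracts towards zero at every cycle, whereas `myopic` stays at its initial state because, without a terminal value term, any control only adds cost inside the one-step horizon.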

Source journal

IEEE Transactions on Robotics (Engineering and Technology: Robotics)

CiteScore: 14.90

Self-citation rate: 5.10%

Articles published: 259

Review time: 6.0 months

Journal description: The IEEE Transactions on Robotics (T-RO) is dedicated to publishing fundamental papers covering all facets of robotics, drawing on interdisciplinary approaches from computer science, control systems, electrical engineering, mathematics, mechanical engineering, and beyond. From industrial applications to service and personal assistants, surgical operations to space, underwater, and remote exploration, robots and intelligent machines play pivotal roles across various domains, including entertainment, safety, search and rescue, military applications, agriculture, and intelligent vehicles. Special emphasis is placed on intelligent machines and systems designed for unstructured environments, where a significant portion of the environment remains unknown and beyond direct sensing or control.