Task-Parameterized Dynamic Movement Primitives With Reinforcement Learning for Improved Motion Planning

IF 4.6 | Zone 2, Computer Science | JCR Q2, Robotics

IEEE Robotics and Automation Letters Pub Date : 2025-04-14 DOI:10.1109/LRA.2025.3560876

Kaiqi Huang;Xiaochun Ji;Jianhua Su;Xiaoyi Qu

IEEE Robotics and Automation Letters, vol. 10, no. 6, pp. 5457-5464. https://ieeexplore.ieee.org/document/10964858/

Citations: 0

Abstract

Online trajectory planning in unstructured environments poses significant challenges for mobile robots, particularly when navigating complex obstacles. Traditional learning-from-demonstration (LfD) methods depend on offline datasets, limiting their ability to adapt to varying obstacle shapes and dynamic conditions. To address these limitations, we propose a novel motion planning framework that combines global trajectory generation with local adaptability. Dynamic Movement Primitives (DMPs) are employed to generate global trajectories based on demonstrations, while Task-Parameterized Potential Fields (TPPFs) enhance local adaptability. The Policy Improvement through Path Integrals (PI$^{2}$) algorithm is utilized to optimize model parameters. The TPPF framework consists of two key components: (a) an obstacle avoidance field, which accounts for the robot's size, obstacle dimensions, and relative distances, allowing effective volumetric avoidance without extensive modeling; and (b) an attractive field, which directs the robot toward task-specific goals while steering it away from undesirable paths. By leveraging the PI$^{2}$ algorithm, model parameters are optimized to produce trajectories that preserve the characteristics of demonstrated motions, while improving obstacle avoidance and task-oriented navigation. Experiments conducted in both simulations and dynamic real-world scenarios validate the proposed framework's effectiveness, demonstrating smoother trajectories and enhanced obstacle avoidance compared to baseline approaches.
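To make the combination of a global DMP trajectory with a local potential field more concrete, the following is a minimal sketch and not the authors' TPPF formulation. It integrates a standard 2-D discrete DMP and adds a generic Khatib-style repulsive term as a coupling acceleration. All names and values here (`rollout`, `repulsive_accel`, `eta`, `influence`, the basis-width heuristic, and a single `radius` that lumps robot and obstacle size together) are illustrative assumptions. Quantities such as `weights` and `eta` are the kind of parameters that an optimizer like PI$^{2}$ could tune; that step is not shown.

```python
import math


def repulsive_accel(pos, center, radius, influence, eta):
    """Generic Khatib-style repulsion (an assumption, not the paper's field).

    `radius` is an inflated radius: obstacle size plus robot size.
    """
    dx, dy = pos[0] - center[0], pos[1] - center[1]
    dist = math.hypot(dx, dy)
    d = max(dist - radius, 1e-6)  # clearance to the inflated surface
    if d >= influence or dist == 0.0:
        return (0.0, 0.0)
    mag = eta * (1.0 / d - 1.0 / influence) / (d * d)
    return (mag * dx / dist, mag * dy / dist)


def rollout(start, goal, weights=None, obstacle=None, radius=0.1,
            influence=0.2, eta=1.0, alpha=25.0, beta=6.25, alpha_x=3.0,
            tau=1.0, dt=0.001, duration=3.0):
    """Integrate a 2-D discrete DMP with an optional repulsive coupling term.

    `weights` holds one list of basis weights per dimension; None means a
    zero forcing term, i.e. a plain critically damped point attractor.
    """
    n = len(weights[0]) if weights else 0
    centers = [math.exp(-alpha_x * i / max(n - 1, 1)) for i in range(n)]
    widths = [n ** 1.5 / c / alpha_x for c in centers]  # common heuristic
    y, v, x = list(start), [0.0, 0.0], 1.0
    path = [tuple(y)]
    for _ in range(round(duration / dt)):
        psi = [math.exp(-h * (x - c) ** 2) for c, h in zip(centers, widths)]
        norm = sum(psi) + 1e-10
        push = (repulsive_accel(y, obstacle, radius, influence, eta)
                if obstacle else (0.0, 0.0))
        for k in range(2):
            f = 0.0
            if weights:
                f = (sum(p * w for p, w in zip(psi, weights[k])) / norm
                     * x * (goal[k] - start[k]))
            # tau * dv/dt = alpha * (beta * (g - y) - v) + f + coupling
            acc = (alpha * (beta * (goal[k] - y[k]) - v[k]) + f + push[k]) / tau
            v[k] += acc * dt
        for k in range(2):
            y[k] += v[k] / tau * dt  # semi-implicit Euler: uses updated v
        x += -alpha_x * x / tau * dt  # canonical system: tau * dx/dt = -alpha_x * x
        path.append(tuple(y))
    return path


def min_distance(path, center):
    """Smallest distance between any trajectory sample and `center`."""
    return min(math.hypot(px - center[0], py - center[1]) for px, py in path)


if __name__ == "__main__":
    obstacle = (0.5, 0.05)
    free = rollout((0.0, 0.0), (1.0, 0.0))
    avoid = rollout((0.0, 0.0), (1.0, 0.0), obstacle=obstacle)
    print("closest approach without field:", min_distance(free, obstacle))
    print("closest approach with field:   ", min_distance(avoid, obstacle))
    print("final position with field:     ", avoid[-1])
```

In this sketch the repulsive term is zero outside the influence band, so the goal remains the attractor of the DMP as long as the goal lies outside that band. Measuring clearance from an inflated radius is one simple way to account for robot and obstacle size without modeling their shapes.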

Source journal

IEEE Robotics and Automation Letters (Computer Science: Computer Science Applications)

CiteScore: 9.60

Self-citation rate: 15.40%

Articles published: 1428

About the journal: The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.