Task-Parameterized Dynamic Movement Primitives With Reinforcement Learning for Improved Motion Planning

Impact factor: 4.6 | CAS partition: Zone 2 (Computer Science) | JCR: Q2 (Robotics)
Kaiqi Huang;Xiaochun Ji;Jianhua Su;Xiaoyi Qu
DOI: 10.1109/LRA.2025.3560876
Journal: IEEE Robotics and Automation Letters, vol. 10, no. 6, pp. 5457-5464
Published: 2025-04-14
Publication type: Journal Article
URL: https://ieeexplore.ieee.org/document/10964858/
Citations: 0

Abstract

Online trajectory planning in unstructured environments poses significant challenges for mobile robots, particularly when navigating complex obstacles. Traditional learning-from-demonstration (LfD) methods depend on offline datasets, limiting their ability to adapt to varying obstacle shapes and dynamic conditions. To address these limitations, we propose a novel motion planning framework that combines global trajectory generation with local adaptability. Dynamic Movement Primitives (DMPs) are employed to generate global trajectories based on demonstrations, while Task-Parameterized Potential Fields (TPPFs) enhance local adaptability. The Policy Improvement through Path Integrals (PI$^{2}$) algorithm is utilized to optimize model parameters. The TPPF framework consists of two key components: (a) an obstacle avoidance field, which accounts for the robot's size, obstacle dimensions, and relative distances, allowing effective volumetric avoidance without extensive modeling; and (b) an attractive field, which directs the robot toward task-specific goals while steering it away from undesirable paths. By leveraging the PI$^{2}$ algorithm, model parameters are optimized to produce trajectories that preserve the characteristics of demonstrated motions, while improving obstacle avoidance and task-oriented navigation. Experiments conducted in both simulations and dynamic real-world scenarios validate the proposed framework's effectiveness, demonstrating smoother trajectories and enhanced obstacle avoidance compared to baseline approaches.
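
The abstract does not give the TPPF equations, so the following is only a generic sketch of the idea it describes: a standard discrete DMP produces the global motion toward the goal, and a classic Khatib-style repulsive potential (a stand-in for the obstacle avoidance field, not the authors' formulation) adds local deflection that accounts for robot and obstacle radii. All function names, gains, and the obstacle geometry are illustrative assumptions. The basis weights `w` are the kind of parameters an optimizer such as PI$^{2}$ would tune; here they are simply left at zero.

```python
import numpy as np

def forcing_term(x, w, centers, widths, y0, g):
    """DMP forcing term: normalized Gaussian basis functions weighted by w."""
    psi = np.exp(-widths * (x - centers) ** 2)             # shape (n_basis,)
    return (psi @ w) / (psi.sum() + 1e-10) * x * (g - y0)  # shape (dim,)

def repulsive_force(y, obs_center, r_robot, r_obs, d0=0.3, eta=1.0):
    """Khatib-style repulsion (assumed stand-in for an avoidance field).

    d is the clearance between the robot body and the obstacle surface;
    the force is zero once the clearance exceeds the influence range d0.
    """
    offset = y - obs_center
    dist = max(np.linalg.norm(offset), 1e-9)
    d = max(dist - (r_robot + r_obs), 1e-3)
    if d >= d0:
        return np.zeros_like(y)
    return eta * (1.0 / d - 1.0 / d0) / d ** 2 * offset / dist

def rollout(y0, g, w, obstacle=None, tau=1.0, dt=0.001, T=3.0,
            alpha_z=25.0, beta_z=6.25, alpha_x=4.0):
    """Integrate the DMP with semi-implicit Euler; returns the path (N, dim)."""
    y0, g = np.asarray(y0, float), np.asarray(g, float)
    n_basis = w.shape[0]
    centers = np.exp(-alpha_x * np.linspace(0.0, 1.0, n_basis))
    widths = n_basis ** 1.5 / centers / alpha_x   # common width heuristic
    y, z, x = y0.copy(), np.zeros_like(y0), 1.0
    path = [y.copy()]
    for _ in range(round(T / dt)):
        f = forcing_term(x, w, centers, widths, y0, g)
        c = repulsive_force(y, *obstacle) if obstacle is not None else np.zeros_like(y)
        # Transformation system: spring-damper toward g plus forcing and coupling.
        z = z + dt * (alpha_z * (beta_z * (g - y) - z) + f + c) / tau
        y = y + dt * z / tau          # uses the updated z (semi-implicit step)
        x = x + dt * (-alpha_x * x) / tau   # canonical system (phase decay)
        path.append(y.copy())
    return np.array(path)

if __name__ == "__main__":
    w = np.zeros((10, 2))                         # untrained weights
    obs_center = np.array([0.5, 0.05])
    path = rollout([0.0, 0.0], [1.0, 0.0], w,
                   obstacle=(obs_center, 0.05, 0.10))
    clearance = np.min(np.linalg.norm(path - obs_center, axis=1))
    print("final position:", path[-1], "min distance to obstacle:", clearance)
```

In this sketch the obstacle is inflated by the robot radius, so keeping the path center outside `r_robot + r_obs` corresponds to the volumetric avoidance the abstract mentions. A learned framework would replace the fixed gains `eta` and `d0` and the zero weights with optimized values.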
Source journal
IEEE Robotics and Automation Letters (Computer Science: Computer Science Applications)
CiteScore: 9.60
Self-citation rate: 15.40%
Articles published: 1428
About the journal: The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.