Improved DDPG-Based Path Planning for Mobile Robots

Xianyong Ruan, Du Jiang, Juntong Yun, Bo Tao, Yuanmin Xie, Baojia Chen, Meng Jia, Li Huang

Concurrency and Computation: Practice and Experience, vol. 37, issue 25-26. Published 2025-09-30. DOI: 10.1002/cpe.70317
Abstract
With the rapid advancement of robotics technology, path planning has attracted extensive research attention. Reinforcement learning, owing to its ability to acquire optimal policies through continuous interaction with the environment, offers a promising solution for path planning in environments with incomplete or unknown information. However, reinforcement learning-based path planning methods often suffer from high training complexity and low utilization of effective samples. To address these issues, this paper proposes an improved deep reinforcement learning (DRL) algorithm. The proposed approach builds upon the deep deterministic policy gradient (DDPG) algorithm and incorporates a short-term goal planning strategy based on local perceptual information, which decomposes the global navigation task into multiple short-term subgoals, thereby reducing task complexity and enhancing learning efficiency. Furthermore, a reward function integrating the artificial potential field (APF) method is designed to improve obstacle avoidance capability. To tackle the low utilization of effective experiences in DDPG, a dual experience pool strategy is introduced to improve experience utilization efficiency and accelerate model training. The parameters for short-term goal selection are optimized through multiple comparative experiments, and the proposed method is evaluated against several DRL-based path planning approaches in a static environment. Experimental results demonstrate that the improved algorithm significantly accelerates convergence. Moreover, dynamic environment simulation experiments verify that the proposed algorithm can effectively avoid moving obstacles and achieve safe navigation to the target position.
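The abstract does not give implementation details, but its three ingredients can be illustrated. Below is a minimal, hypothetical Python sketch of (a) an APF-shaped reward, built from an attractive potential toward the current goal and a repulsive potential near obstacles, (b) a dual experience pool that draws a fixed fraction of each training batch from a pool of "effective" transitions, and (c) a simple short-term subgoal picked within sensor range along the line to the global goal. All names (apf_reward, DualReplayBuffer, short_term_goal), gains (K_ATT, K_REP, D_SAFE), and the sampling ratio are illustrative assumptions, not the authors' actual design.

```python
import random
from collections import deque

import numpy as np

# Hypothetical constants; the paper's actual gains and thresholds are not stated in the abstract.
K_ATT = 1.0    # attractive-potential gain toward the (sub)goal
K_REP = 0.5    # repulsive-potential gain away from obstacles
D_SAFE = 1.0   # obstacle influence radius (m)


def apf_reward(pos, goal, obstacles):
    """APF-shaped reward: negative total potential, so moving downhill raises reward."""
    d_goal = np.linalg.norm(goal - pos)
    u_att = 0.5 * K_ATT * d_goal ** 2          # attractive potential grows with goal distance
    u_rep = 0.0
    for obs in obstacles:                       # repulsive term only inside the influence radius
        d = np.linalg.norm(obs - pos)
        if d < D_SAFE:
            u_rep += 0.5 * K_REP * (1.0 / d - 1.0 / D_SAFE) ** 2
    return -(u_att + u_rep)


def short_term_goal(pos, global_goal, sense_range=3.0):
    """Pick a subgoal on the line to the global goal, clipped to the sensor range."""
    delta = global_goal - pos
    d = np.linalg.norm(delta)
    return global_goal if d <= sense_range else pos + delta * (sense_range / d)


class DualReplayBuffer:
    """Two pools: one for 'effective' transitions (e.g., goal-reaching or
    high-reward steps) and one for ordinary ones; batches mix both."""

    def __init__(self, capacity=100_000, effective_ratio=0.5):
        self.effective = deque(maxlen=capacity)
        self.ordinary = deque(maxlen=capacity)
        self.effective_ratio = effective_ratio

    def add(self, transition, is_effective):
        (self.effective if is_effective else self.ordinary).append(transition)

    def sample(self, batch_size):
        # Draw the configured fraction from the effective pool, the rest from the ordinary pool.
        n_eff = min(int(batch_size * self.effective_ratio), len(self.effective))
        batch = random.sample(self.effective, n_eff)
        batch += random.sample(self.ordinary,
                               min(batch_size - n_eff, len(self.ordinary)))
        return batch
```

Splitting the replay memory this way keeps rare goal-reaching experiences from being drowned out by the far more numerous failed or neutral transitions, which is one plausible reading of how a dual experience pool would raise sample utilization and speed up training.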
Journal Introduction
Concurrency and Computation: Practice and Experience (CCPE) publishes high-quality original research papers and authoritative research review papers in the overlapping fields of:
Parallel and distributed computing;
High-performance computing;
Computational and data science;
Artificial intelligence and machine learning;
Big data applications, algorithms, and systems;
Network science;
Ontologies and semantics;
Security and privacy;
Cloud/edge/fog computing;
Green computing; and
Quantum computing.