Mobile robot path planning based on multi-experience pool deep deterministic policy gradient in unknown environment

IF 2.7 | Zone 3 (Computer Science) | Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

International Journal of Machine Learning and Cybernetics Pub Date : 2024-08-04 DOI:10.1007/s13042-024-02281-6

Linxin Wei, Quanxing Xu, Ziyu Hu


Citations: 0

Abstract


Mobile robot path planning based on multi-experience pool deep deterministic policy gradient in unknown environment


Path planning for unmanned mobile robots has always been a crucial issue, especially in unknown environments. Reinforcement learning is widely used in path planning because of its ability to learn from unknown environments. However, in unknown environments, deep reinforcement learning algorithms suffer from problems such as long training time and instability. In this article, the deep deterministic policy gradient (DDPG) algorithm is improved to address these issues. First, the experience pool is divided into several experience pools based on the difference between adjacent states. Second, experience is sampled from the various pools in different proportions for training, enabling the robot to achieve good obstacle avoidance ability. Finally, a guided reward function is designed to improve the convergence speed of the algorithm, so that the robot can find the target point faster. The algorithm has been tested in simulation and in practice, and the results show that it enables robots to complete path planning tasks in complex unknown environments.
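The abstract does not give the rule used to split the experience pool, the sampling proportions, or the form of the guided reward, so the following is only a minimal sketch of the general idea under stated assumptions. It assumes transitions are routed to a pool by the Euclidean distance between adjacent states, that each pool contributes a fixed share of every training batch, and that the reward is shaped by progress toward the goal. The names `MultiPoolReplayBuffer` and `guided_reward`, the thresholds, the ratios, and the reward constants are all hypothetical and are not taken from the paper.

```python
import math
import random
from collections import deque


class MultiPoolReplayBuffer:
    """Replay buffer split into several pools by the size of the state change.

    Assumption: a transition goes to pool i when the Euclidean distance
    between state and next_state falls in the i-th interval defined by the
    ascending `thresholds`. The paper's actual rule is not in the abstract.
    """

    def __init__(self, thresholds, ratios, capacity=10000):
        if len(ratios) != len(thresholds) + 1:
            raise ValueError("need exactly one ratio per pool")
        self.thresholds = sorted(thresholds)
        total = float(sum(ratios))
        self.ratios = [r / total for r in ratios]  # normalised sampling shares
        self.pools = [deque(maxlen=capacity) for _ in ratios]

    def add(self, state, action, reward, next_state, done):
        # Difference between adjacent states decides the target pool.
        diff = math.dist(state, next_state)
        idx = sum(diff >= t for t in self.thresholds)
        self.pools[idx].append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Draw from each pool in proportion to its ratio. The batch can be
        # smaller than batch_size while some pools are still underfilled.
        batch = []
        for pool, ratio in zip(self.pools, self.ratios):
            k = min(len(pool), int(round(batch_size * ratio)))
            batch.extend(random.sample(list(pool), k))
        return batch


def guided_reward(prev_dist, curr_dist, collided, reached, scale=1.0):
    """Hypothetical guided reward: pay for progress toward the target."""
    if reached:
        return 100.0   # assumed terminal bonus
    if collided:
        return -100.0  # assumed collision penalty
    return scale * (prev_dist - curr_dist)
```

In a DDPG training loop, `sample` would stand in for the uniform draw from a single replay buffer, while the actor and critic updates stay unchanged. Giving a larger ratio to the pool with large state changes is one possible way to emphasise obstacle-avoidance experience, but whether this matches the authors' proportions is not stated in the abstract.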


Source journal

International Journal of Machine Learning and Cybernetics (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE)

CiteScore

7.90

Self-citation rate

10.70%

Articles published

225

About the journal: Cybernetics is concerned with describing complex interactions and interrelationships between systems which are omnipresent in our daily life. Machine Learning discovers fundamental functional relationships between variables and ensembles of variables in systems. The merging of the disciplines of Machine Learning and Cybernetics is aimed at the discovery of various forms of interaction between systems through diverse mechanisms of learning from data. The International Journal of Machine Learning and Cybernetics (IJMLC) focuses on the key research problems emerging at the junction of machine learning and cybernetics and serves as a broad forum for rapid dissemination of the latest advancements in the area. The emphasis of IJMLC is on the hybrid development of machine learning and cybernetics schemes inspired by different contributing disciplines such as engineering, mathematics, cognitive sciences, and applications. New ideas, design alternatives, implementations and case studies pertaining to all the aspects of machine learning and cybernetics fall within the scope of the IJMLC. Key research areas to be covered by the journal include: Machine Learning for modeling interactions between systems; Pattern Recognition technology to support discovery of system-environment interaction; Control of system-environment interactions; Biochemical interaction in biological and biologically-inspired systems; Learning for improvement of communication schemes between systems.