Deep reinforcement learning-based local path planning in dynamic environments for mobile robot

IF 6.1 | Zone 2, Computer Science | Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS

Journal of King Saud University-Computer and Information Sciences Pub Date : 2024-12-01 DOI:10.1016/j.jksuci.2024.102254

Bodong Tao, Jae-Hoon Kim

Volume 36, Issue 10, Article 102254. https://www.sciencedirect.com/science/article/pii/S1319157824003434

Citations: 0

Abstract

Path planning for robots in dynamic environments is a challenging task, as it requires balancing obstacle avoidance, trajectory smoothness, and path length during real-time planning. This paper proposes an algorithm called Adaptive Soft Actor–Critic (ASAC), which combines the Soft Actor–Critic (SAC) algorithm, tile coding, and the Dynamic Window Approach (DWA) to enhance path planning capabilities. ASAC leverages SAC with an automatic entropy adjustment mechanism to balance exploration and exploitation, integrates tile coding for improved feature representation, and uses DWA to define the action space. In this framework, the action space is defined by DWA's three weighting parameters: target heading deviation, distance to the nearest obstacle, and velocity. To facilitate the learning process, a non-sparse reward function is designed, incorporating factors such as Time-to-Collision (TTC), heading, and velocity. To validate the effectiveness of the algorithm, experiments were conducted in four different environments, and the algorithm was evaluated on metrics such as trajectory deviation, smoothness, and time to reach the end point. The results demonstrate that ASAC outperforms existing algorithms in terms of trajectory smoothness, arrival time, and overall adaptability across various scenarios, effectively enabling path planning in dynamic environments.
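
To make the role of the three DWA weighting parameters concrete, the following is a minimal Python sketch of a DWA-style velocity selection in which the weights are supplied from outside, as a learned policy might do. It is not the authors' implementation: the unicycle motion model, the normalization of each term, the 2 m clearance cap, the end-pose-only collision check, and the names `simulate` and `dwa_select` are all assumptions made for illustration.

```python
import math


def simulate(x, y, theta, v, w, dt=0.1, horizon=1.0):
    """Roll a unicycle model forward with Euler steps and return the end pose."""
    steps = int(round(horizon / dt))
    for _ in range(steps):
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        theta += w * dt
    return x, y, theta


def dwa_select(pose, goal, obstacles, weights, v_options, w_options,
               robot_radius=0.2, clearance_cap=2.0):
    """Return the (v, w) pair with the highest weighted DWA score, or None.

    weights = (alpha, beta, gamma) scale the heading, clearance, and
    velocity terms. Here they are plain inputs; in an ASAC-style setup
    they would be the action chosen by the policy (assumption).
    """
    alpha, beta, gamma = weights
    v_max = max(v_options) or 1.0  # avoid dividing by zero
    best, best_score = None, -math.inf
    for v in v_options:
        for w in w_options:
            x, y, theta = simulate(*pose, v, w)
            # Simplification: clearance is checked at the end pose only.
            clearance = min(
                (math.hypot(ox - x, oy - y) for ox, oy in obstacles),
                default=clearance_cap,
            )
            if clearance <= robot_radius:
                continue  # candidate ends in collision, discard it
            # Heading term: 1 when facing the goal, 0 when facing away.
            goal_dir = math.atan2(goal[1] - y, goal[0] - x)
            diff = goal_dir - theta
            err = abs(math.atan2(math.sin(diff), math.cos(diff)))
            heading = 1.0 - err / math.pi
            score = (alpha * heading
                     + beta * min(clearance, clearance_cap) / clearance_cap
                     + gamma * v / v_max)
            if score > best_score:
                best, best_score = (v, w), score
    return best


if __name__ == "__main__":
    speeds = [0.0, 0.5, 1.0]
    turn_rates = [-1.0, 0.0, 1.0]
    # Free space, goal straight ahead: favor heading and speed.
    print(dwa_select((0.0, 0.0, 0.0), (5.0, 0.0), [], (1.0, 0.0, 1.0),
                     speeds, turn_rates))
```

Changing `weights` shifts the trade-off between steering toward the goal, keeping clear of obstacles, and moving fast, which is the quantity the abstract describes as the action space.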


Source journal

Journal of King Saud University-Computer and Information Sciences (COMPUTER SCIENCE, INFORMATION SYSTEMS)

CiteScore: 10.50

Self-citation rate: 8.70%

Articles published: 656

Review time: 29 days

Journal description: In 2022 the Journal of King Saud University - Computer and Information Sciences will become an author paid open access journal. Authors who submit their manuscript after October 31st 2021 will be asked to pay an Article Processing Charge (APC) after acceptance of their paper to make their work immediately, permanently, and freely accessible to all. The Journal of King Saud University - Computer and Information Sciences is a refereed, international journal that covers all aspects of both the foundations of computing and its practical applications.