Multiple Robots Path Planning based on Reinforcement Learning for Object Transportation

Proceedings of the 2022 5th Artificial Intelligence and Cloud Computing Conference. Pub Date: 2022-12-17. DOI: 10.1145/3582099.3582133

M. Parnichkun



Abstract

This paper proposes reinforcement learning methods for a multi-robot object transportation task. The task consists of two main subtasks: path planning and motion control. A double deep Q-learning (DDQN) model is selected to perform path planning in an unknown environment. To improve the capability of the reinforcement learning model, semi-supervision from the A* algorithm is applied during training. For motion control, a reinforcement learning model is designed to control the movement of a differential wheeled mobile robot; the agent computes the robot's actions, which consist of linear and angular velocities. The motion control models are trained separately for two different purposes: the first agent handles the path-following task, and the second handles the point-following task. The point-following agent is used to make the group of robots move in a specific formation. Proximal policy optimization (PPO) is selected for the path-following task, and deep deterministic policy gradient (DDPG) is selected for the point-following task. Integrated, the proposed reinforcement learning models accomplish the multi-robot object transportation task successfully in both simulation and experiment.
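To make the action space concrete, the following is a minimal sketch of how a (linear velocity, angular velocity) command could drive a differential wheeled robot. It uses the standard unicycle kinematic model with forward Euler integration. The function names, the `wheel_radius` and `track_width` parameters, and the time step are illustrative assumptions and are not taken from the paper.

```python
import math


def wheel_speeds(v, omega, wheel_radius, track_width):
    """Convert a body command (v in m/s, omega in rad/s) into
    left/right wheel angular speeds in rad/s.

    wheel_radius and track_width are hypothetical robot parameters.
    """
    v_left = v - omega * track_width / 2.0   # linear speed of left wheel
    v_right = v + omega * track_width / 2.0  # linear speed of right wheel
    return v_left / wheel_radius, v_right / wheel_radius


def step_pose(x, y, theta, v, omega, dt):
    """Advance the unicycle model by one forward Euler step of length dt."""
    x_new = x + v * math.cos(theta) * dt
    y_new = y + v * math.sin(theta) * dt
    theta_new = theta + omega * dt
    # Wrap the heading into [-pi, pi) so it stays bounded.
    theta_new = (theta_new + math.pi) % (2.0 * math.pi) - math.pi
    return x_new, y_new, theta_new


if __name__ == "__main__":
    # Assumed example: drive forward while turning left for 10 steps.
    pose = (0.0, 0.0, 0.0)
    for _ in range(10):
        pose = step_pose(*pose, v=0.2, omega=0.5, dt=0.1)
    print(pose)
    print(wheel_speeds(0.2, 0.5, wheel_radius=0.05, track_width=0.3))
```

In a setup like the one described, a trained agent would supply `v` and `omega` at every control step. A simulator or the robot's low-level controller would then apply something like `wheel_speeds` to obtain motor commands.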
