Yixin Huang, Shufan Wu, Z. Mu, Xiangyu Long, Sunhao Chu, G. Zhao
Title: A Multi-agent Reinforcement Learning Method for Swarm Robots in Space Collaborative Exploration
Published in: 2020 6th International Conference on Control, Automation and Robotics (ICCAR)
Publication date: 2020-04-01
DOI: 10.1109/ICCAR49639.2020.9107997 (https://doi.org/10.1109/ICCAR49639.2020.9107997)
Citations: 14
Abstract
Deep-space exploration missions are particularly challenging, with high risk and cost, because they operate in environments with high uncertainty. The failure of a single exploration robot can cause the entire mission to fail. One solution is to use swarm robots that carry out missions collaboratively. Compared with a single highly capable robot, a swarm of less sophisticated robots can cooperate on multiple, complex tasks. Reinforcement learning (RL) has made substantial progress in the autonomous cooperative control of multi-agent systems. In this paper, we construct a collaborative exploration scenario in which a multi-robot system explores an unknown Mars surface. Tasks are assigned to the robots by human scientists, and each robot executes its optimal policy autonomously. Policies are trained with the multi-agent deep deterministic policy gradient (MADDPG) algorithm, and we design an experience sample optimizer to improve it. The results show that, as the number of robots and targets increases, this method is more efficient than a traditional deep RL algorithm in a multi-agent collaborative exploration environment.
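The abstract's "experience sample optimizer" suggests biasing which stored transitions are replayed during MADDPG training, rather than sampling the buffer uniformly. The paper does not give its implementation; as a rough, hypothetical sketch, a priority-weighted replay buffer (here keyed on TD-error magnitude, a common choice) might look like:

```python
import random
from collections import deque

class PrioritizedReplayBuffer:
    """Replay buffer that samples transitions in proportion to a priority
    score (here the magnitude of the last TD error), so informative
    experiences are replayed more often than under uniform sampling.
    Illustrative only; not the authors' actual optimizer."""

    def __init__(self, capacity=10000, eps=1e-3):
        # deque(maxlen=...) evicts the oldest entry once capacity is reached
        self.buffer = deque(maxlen=capacity)
        self.priorities = deque(maxlen=capacity)
        self.eps = eps  # floor so every transition keeps a nonzero weight

    def add(self, transition, td_error=1.0):
        self.buffer.append(transition)
        self.priorities.append(abs(td_error) + self.eps)

    def sample(self, batch_size):
        # Sample (with replacement) proportionally to priority
        total = sum(self.priorities)
        weights = [p / total for p in self.priorities]
        return random.choices(list(self.buffer), weights=weights, k=batch_size)
```

In a MADDPG loop each agent's critic update would draw its minibatch from such a buffer and write the resulting TD errors back as new priorities.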