Deep Reinforcement Learning of Cooperative Control with Four Robotic Agents by MADDPG

Zhaoyang Wang, Renzhuo Wan, Xi Gui, Guopeng Zhou
{"title":"Deep Reinforcement Learning of Cooperative Control with Four Robotic Agents by MADDPG","authors":"Zhaoyang Wang, Renzhuo Wan, Xi Gui, Guopeng Zhou","doi":"10.1109/icceic51584.2020.00061","DOIUrl":null,"url":null,"abstract":"Due to the nature of complexity, inflexibility and non-robustness of classical cooperative control algorithms, the deep reinforcement learning has been widely researched and applied in collective and continuous behaviour control. Especially for multi-agents in real world, acquiring a full view world with a quick learning is still a great challenge. Inspired by Policy Gradient (PG) and its successors, a toy model with multi-agents by four two-dimensional manipulators environment is built based on physics engine-based MuJoCo. With a modified deep deterministic policy gradient algorithm and different credit strategies for individual agent, the cooperation and competition behaviour to target location between agents are studied. The experimental results show that each robot can complete the task with a negligible convergence effect, indicating that the MADDPG algorithm has a good performance in a complex environment, and successfully learn the strategy of multi-agent collaboration. However, with the instability of the environment caused by the increase in the number of agents, deep reinforcement learning has certain difficulties in the joint action space.","PeriodicalId":135840,"journal":{"name":"2020 International Conference on Computer Engineering and Intelligent Control (ICCEIC)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 International Conference on Computer Engineering and Intelligent Control (ICCEIC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/icceic51584.2020.00061","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 6

Abstract

Because classical cooperative control algorithms are complex, inflexible, and non-robust, deep reinforcement learning has been widely studied and applied to collective, continuous behaviour control. For multiple agents in the real world in particular, acquiring a full view of the world while learning quickly remains a great challenge. Inspired by Policy Gradient (PG) methods and their successors, a toy multi-agent environment with four two-dimensional manipulators is built on the MuJoCo physics engine. Using a modified deep deterministic policy gradient algorithm and different credit strategies for individual agents, the cooperative and competitive behaviour of the agents in reaching a target location is studied. The experimental results show that each robot completes its task with negligible convergence effects, indicating that the MADDPG algorithm performs well in a complex environment and successfully learns a multi-agent collaboration strategy. However, as a growing number of agents makes the environment increasingly non-stationary, deep reinforcement learning faces certain difficulties in the joint action space.
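The paper itself provides no code, but the abstract's description maps onto the standard MADDPG structure of decentralized actors trained against centralized critics. The sketch below is a minimal PyTorch illustration of that update for a four-agent setting; the observation and action dimensions, network sizes, and all hyperparameters are assumptions for a toy setup, not values from the paper.

```python
# Minimal MADDPG update sketch (PyTorch). Dimensions, network sizes, and
# hyperparameters are illustrative placeholders, not values from the paper.
import torch
import torch.nn as nn

N_AGENTS, OBS_DIM, ACT_DIM = 4, 8, 2  # four 2-D manipulators (dims assumed)
GAMMA, TAU = 0.95, 0.01               # discount and soft-update rate (assumed)

def mlp(in_dim, out_dim):
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(),
                         nn.Linear(64, out_dim))

# Decentralized actors: each agent maps only its own observation to an action.
actors = [mlp(OBS_DIM, ACT_DIM) for _ in range(N_AGENTS)]
# Centralized critics: each agent's critic sees all observations and actions,
# which is what counters the non-stationarity of other agents' policies.
critics = [mlp(N_AGENTS * (OBS_DIM + ACT_DIM), 1) for _ in range(N_AGENTS)]
target_actors = [mlp(OBS_DIM, ACT_DIM) for _ in range(N_AGENTS)]
target_critics = [mlp(N_AGENTS * (OBS_DIM + ACT_DIM), 1) for _ in range(N_AGENTS)]
for net, tgt in zip(actors + critics, target_actors + target_critics):
    tgt.load_state_dict(net.state_dict())

actor_opts = [torch.optim.Adam(a.parameters(), lr=1e-3) for a in actors]
critic_opts = [torch.optim.Adam(c.parameters(), lr=1e-3) for c in critics]

def update(obs, act, rew, next_obs):
    """One MADDPG step. obs/act/next_obs: (B, N_AGENTS, dim); rew: (B, N_AGENTS)."""
    B = obs.shape[0]
    next_act = torch.stack([target_actors[i](next_obs[:, i]).tanh()
                            for i in range(N_AGENTS)], dim=1)
    for i in range(N_AGENTS):
        # Critic i: regress toward the one-step TD target from the target nets.
        with torch.no_grad():
            q_next = target_critics[i](torch.cat(
                [next_obs.reshape(B, -1), next_act.reshape(B, -1)], dim=1))
            y = rew[:, i:i + 1] + GAMMA * q_next
        q = critics[i](torch.cat([obs.reshape(B, -1), act.reshape(B, -1)], dim=1))
        critic_loss = nn.functional.mse_loss(q, y)
        critic_opts[i].zero_grad(); critic_loss.backward(); critic_opts[i].step()

        # Actor i: ascend the centralized Q, differentiating only through
        # agent i's own action; the other agents' actions stay as sampled.
        new_act = act.clone()
        new_act[:, i] = actors[i](obs[:, i]).tanh()
        actor_loss = -critics[i](torch.cat(
            [obs.reshape(B, -1), new_act.reshape(B, -1)], dim=1)).mean()
        actor_opts[i].zero_grad(); actor_loss.backward(); actor_opts[i].step()

    # Polyak-average each online network into its target copy.
    for net, tgt in zip(actors + critics, target_actors + target_critics):
        for p, tp in zip(net.parameters(), tgt.parameters()):
            tp.data.mul_(1 - TAU).add_(TAU * p.data)
```

In practice `update` would be called on minibatches sampled from a shared replay buffer; the per-agent reward tensor `rew` is also where the paper's different credit strategies for individual agents would enter.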