Cumulative Training and Transfer Learning for Multi-Robots Collision-Free Navigation Problems

Trung-Thanh Nguyen, Amartya Hatua, A. Sung
Published in: 2019 IEEE 10th Annual Ubiquitous Computing, Electronics & Mobile Communication Conference (UEMCON)
Publication date: 2019-10-01
DOI: 10.1109/UEMCON47517.2019.8992945 (https://doi.org/10.1109/UEMCON47517.2019.8992945)
Citations: 4

Abstract

In recent years, robot autonomy, decentralized control, collective decision-making, and high fault tolerance have significantly broadened the applications of swarm robotics in targeted material delivery, precision farming, surveillance, defense, and many other areas. In these multi-agent systems, safe collision avoidance is one of the most fundamental and important problems. Different approaches, especially reinforcement learning, have been applied to solve it. This paper introduces a new cumulative learning approach that combines transfer learning with distributed multi-agent reinforcement learning techniques to solve collision-free navigation for swarm robotics. In our method, as learning progresses from the least complex scenario to the most complex one, multiple agents can improve the shared policy through parameter sharing, reward shaping, and multi-round, multi-step learning. We have adapted two policy gradient algorithms (TRPO and PPO) as the core of our distributed multi-agent reinforcement learning method. The results show that our new methodology can help reduce the training time and generate a robust navigation plan that can easily be generalized to complex indoor scenarios.
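
For readers unfamiliar with the PPO component named in the abstract, the following minimal sketch computes the standard PPO clipped surrogate objective over samples pooled from several agents, which is one common way to realise a shared policy with parameter sharing. The function names, the sample layout, the clip value of 0.2, and the pooling scheme are illustrative assumptions; the abstract does not describe the authors' implementation, reward shaping, or training schedule.

```python
from typing import List, Sequence, Tuple

# One sample = (probability ratio pi_new/pi_old, advantage estimate).
# This layout is an assumption made for the sketch.
Sample = Tuple[float, float]


def pool_agent_samples(per_agent: Sequence[Sequence[Sample]]) -> List[Sample]:
    """Concatenate every agent's samples into one batch.

    With a single shared policy, experience from all agents can be
    used for the same parameter update.
    """
    return [sample for agent in per_agent for sample in agent]


def ppo_clipped_objective(samples: Sequence[Sample], clip_eps: float = 0.2) -> float:
    """Mean PPO clipped surrogate objective over a batch of samples.

    For each sample, take min(r * A, clip(r, 1 - eps, 1 + eps) * A).
    """
    if not samples:
        raise ValueError("samples must not be empty")
    total = 0.0
    for ratio, advantage in samples:
        clipped_ratio = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        total += min(ratio * advantage, clipped_ratio * advantage)
    return total / len(samples)


if __name__ == "__main__":
    # Two hypothetical agents, one sample each.
    batch = pool_agent_samples([[(1.5, 1.0)], [(0.5, -1.0)]])
    print(ppo_clipped_objective(batch))
```

In a training loop this value would be maximised with respect to the shared policy parameters. Under the cumulative scheme the abstract describes, the parameters learned in a simpler scenario would presumably serve as the starting point for the next, more complex one.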