An Entanglement-Inspired Action Selection and Knowledge Sharing Scheme for Cooperative Multi-agent Q-Learning Algorithm used in Robot Navigation

Mohammad Hasan Karami, Hossein Aghababa, A. Keyhanipour

2020 10th International Conference on Computer and Knowledge Engineering (ICCKE), published 2020-10-29
DOI: 10.1109/ICCKE50421.2020.9303636 (https://doi.org/10.1109/ICCKE50421.2020.9303636)
Multi-agent reinforcement learning, especially in unknown, complex environments, requires new algorithms. In this work, we adapt the concept of quantum entanglement to the action selection procedure of multi-agent Q-learning, aiming to increase learning speed, improve collision avoidance, and provide full coverage of the environment. Exploration is driven exclusively by a memory-based probabilistic sequential action selection method that acts as a knowledge hub shared among the agents; this shared hub is the central pillar of the work. It increases the parallelism of the learning process and builds an effective yet simple communication bridge between the learning agents: by sharing their accumulated experience and knowledge, agents can signal and guide one another so that they do not repeat mistakes other agents have already made. Simulation results demonstrate the effectiveness of the proposed algorithm in reducing learning time, significantly reducing collisions, and providing full coverage of large, complex, cluttered environments.
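To illustrate the kind of scheme the abstract describes, the following minimal Python sketch shows cooperative Q-learning agents on a grid world that write into, and read from, a shared memory acting as a knowledge hub, biasing each agent's action selection away from actions other agents have already tried or crashed on. This is not the authors' implementation: the grid size, constants, and the inverse-count novelty weighting are illustrative assumptions, since the abstract does not specify the entanglement-inspired weighting itself.

import numpy as np

# Hypothetical problem size and learning constants (illustrative assumptions).
N_STATES, N_ACTIONS = 100, 4
ALPHA, GAMMA, TEMPERATURE = 0.1, 0.95, 1.0

# One Q-table per agent; the visit/collision tables form the shared knowledge hub.
q_tables = [np.zeros((N_STATES, N_ACTIONS)) for _ in range(3)]
shared_visits = np.zeros((N_STATES, N_ACTIONS))      # how often any agent tried (s, a)
shared_collisions = np.zeros((N_STATES, N_ACTIONS))  # how often (s, a) led to a collision


def select_action(agent_id: int, state: int) -> int:
    """Probabilistic action selection biased by the shared memory."""
    q = q_tables[agent_id][state]
    # Penalise actions that any agent has already tried often or that caused collisions.
    novelty = 1.0 / (1.0 + shared_visits[state] + 5.0 * shared_collisions[state])
    prefs = np.exp(q / TEMPERATURE) * novelty
    probs = prefs / prefs.sum()
    return int(np.random.choice(N_ACTIONS, p=probs))


def update(agent_id: int, s: int, a: int, r: float, s_next: int, collided: bool) -> None:
    """Standard Q-learning update plus a write into the shared knowledge hub."""
    q = q_tables[agent_id]
    q[s, a] += ALPHA * (r + GAMMA * q[s_next].max() - q[s, a])
    shared_visits[s, a] += 1
    if collided:
        shared_collisions[s, a] += 1  # other agents will now tend to avoid this action

A training loop would call select_action and update for each agent in turn, so that every agent's experience immediately reshapes the others' exploration through the shared tables.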