The Application of Multi-Agent Reinforcement Learning in UAV Networks

2019 IEEE International Conference on Communications Workshops (ICC Workshops). Pub Date: 2019-05-20. DOI: 10.1109/ICCW.2019.8756984

Jingjing Cui, Yuanwei Liu, A. Nallanathan


Citations: 20

Abstract


This article investigates autonomous resource allocation in multi-UAV-enabled communication networks, with the goal of maximizing long-term rewards. To model the uncertainty of the environment, we formulate the long-term resource allocation problem as a stochastic game in which each UAV is a learning agent and each resource allocation solution corresponds to an action taken by the UAVs. Furthermore, we propose a multi-agent reinforcement learning (MARL) framework in which each agent learns its best strategy from its local observations. More specifically, we propose an agent-independent method in which all agents run a decision algorithm independently but share a common structure based on Q-learning. Finally, simulation results reveal that the proposed MARL algorithm provides acceptable performance compared to the case with complete information exchange among UAVs.
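The agent-independent idea described in the abstract can be illustrated with a minimal sketch of independent tabular Q-learning. This is not the authors' algorithm or system model: the class `IndependentQAgent`, the reward function `toy_rewards` (a channel-collision toy), the single dummy state, and all hyperparameters are assumptions made up for illustration. The only points it is meant to show are that every agent keeps its own Q-table, shares the same update structure, and learns from its own reward without exchanging information.

```python
import random


class IndependentQAgent:
    """Tabular Q-learner that uses only its own state, action and reward."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9,
                 epsilon=0.1, seed=None):
        # One private Q-table per agent; nothing is shared between agents.
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.n_actions = n_actions
        self.rng = random.Random(seed)

    def act(self, state):
        # Epsilon-greedy: explore with probability epsilon, else exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        row = self.q[state]
        return row.index(max(row))

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update toward the bootstrapped target.
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


def toy_rewards(actions):
    """Hypothetical reward: 1 if no other agent chose the same channel."""
    return [1.0 if actions.count(a) == 1 else 0.0 for a in actions]


def train(n_agents=2, n_channels=2, steps=2000, seed=0):
    """Run independent learners on the toy problem (assumed setup)."""
    agents = [IndependentQAgent(1, n_channels, seed=seed + i)
              for i in range(n_agents)]
    state = 0  # single dummy state; a real stochastic game has many
    for _ in range(steps):
        actions = [agent.act(state) for agent in agents]
        rewards = toy_rewards(actions)
        # Each agent updates from its own action and reward only.
        for agent, action, reward in zip(agents, actions, rewards):
            agent.update(state, action, reward, state)
    return agents


if __name__ == "__main__":
    for i, agent in enumerate(train()):
        print(f"agent {i} Q-values: {agent.q[0]}")
```

Because each learner treats the other agents as part of the environment, the environment it faces is non-stationary, so a sketch like this carries no convergence guarantee in general.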
