The Application of Multi-Agent Reinforcement Learning in UAV Networks
Jingjing Cui, Yuanwei Liu, A. Nallanathan
2019 IEEE International Conference on Communications Workshops (ICC Workshops), published 2019-05-20
DOI: 10.1109/ICCW.2019.8756984 (https://doi.org/10.1109/ICCW.2019.8756984)
Citations: 20
Abstract
This article investigates autonomous resource allocation in multi-UAV-enabled communication networks, with the goal of maximizing long-term rewards. To model environmental uncertainty, we formulate the long-term resource allocation problem as a stochastic game in which each UAV is a learning agent and each resource allocation solution corresponds to an action taken by the UAVs. We then propose a multi-agent reinforcement learning (MARL) framework in which each agent learns its best strategy from its local observations. More specifically, we propose an agent-independent method in which all agents run a decision algorithm independently but share a common structure based on Q-learning. Finally, simulation results show that the proposed MARL algorithm achieves acceptable performance compared with the case of complete information exchange among UAVs.
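The agent-independent idea described in the abstract can be illustrated with a minimal sketch of independent Q-learning. This is not the paper's algorithm or system model: the class `IndependentQAgent`, the function `toy_rewards`, the channel-collision reward, and the choice of each agent's previous channel as its local observation are all hypothetical, chosen only to show agents that share one Q-learning structure while keeping separate Q-tables.

```python
import random


class IndependentQAgent:
    """One learner with its own Q-table; it never reads another agent's table."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9,
                 epsilon=0.1, seed=0):
        self.q = [[0.0] * n_actions for _ in range(n_states)]
        self.n_actions = n_actions
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor for long-term reward
        self.epsilon = epsilon  # exploration probability
        self.rng = random.Random(seed)

    def act(self, state):
        # Epsilon-greedy: explore with probability epsilon, else exploit.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        row = self.q[state]
        return row.index(max(row))

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update using only this agent's own experience.
        target = reward + self.gamma * max(self.q[next_state])
        self.q[state][action] += self.alpha * (target - self.q[state][action])


def toy_rewards(actions):
    """Hypothetical reward: 1 if no other agent picked the same channel, else 0."""
    return [1.0 if actions.count(a) == 1 else 0.0 for a in actions]


def train(n_agents=2, n_channels=2, steps=2000, seed=0):
    # Every agent uses the same class (common structure) but a separate table.
    agents = [IndependentQAgent(n_channels, n_channels, seed=seed + i)
              for i in range(n_agents)]
    # Assumed local observation: the channel the agent used in the last step.
    states = [0] * n_agents
    for _ in range(steps):
        actions = [ag.act(s) for ag, s in zip(agents, states)]
        rewards = toy_rewards(actions)
        for i, ag in enumerate(agents):
            ag.update(states[i], actions[i], rewards[i], actions[i])
        states = actions
    return agents
```

Calling `train()` returns the list of trained agents. Each agent updates from its own reward and observation only, so no information is exchanged between agents, which is the contrast the abstract draws with the complete-information case.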