Partially Observable Multi-Agent Deep Reinforcement Learning for Cognitive Resource Management

Ning Yang, Haijun Zhang, R. Berry
Published in: GLOBECOM 2020 - 2020 IEEE Global Communications Conference, pp. 1-6
Publication date: 2020-12-01
DOI: 10.1109/GLOBECOM42002.2020.9322150
URL: https://doi.org/10.1109/GLOBECOM42002.2020.9322150
Citations: 9

Abstract

This paper investigates dynamic resource management in a cognitive radio network (CRN) with multiple primary users (PUs), multiple secondary users (SUs), and multiple channels. The optimization problem is formulated as a multi-agent partially observable Markov decision process (POMDP) in a dynamic environment that is not fully observable. We consider deep reinforcement learning (DRL) to address this problem. Based on the channel occupancy of the PUs, a multi-agent deep Q-network (DQN)-based dynamic joint spectrum access and mode selection (SAMS) scheme is proposed for the SUs in the partially observable environment. The scheme maps the current observation of each SU to a suitable action. Each SU makes its own decision without exchanging information with other SUs, and the objective is to maximize the sum rate. Simulation results verify the effectiveness of the proposed scheme.
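The decentralized decision rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a tabular Q-function stands in for the deep Q-network, and the number of channels, the mode names, the observation (the last accessed channel and whether a PU occupied it), the PU activity probability, and the 0/1 reward are all assumptions made only for illustration. In this toy the mode is part of the action but does not affect the reward.

```python
import random
from collections import defaultdict
from itertools import product

N_CHANNELS = 3                 # assumed number of channels
MODES = ("mode_a", "mode_b")   # placeholder names; the abstract does not name the modes
ACTIONS = list(product(range(N_CHANNELS), MODES))  # joint (channel, mode) actions


class IndependentAgent:
    """One SU: learns from its own observations only, with no message exchange."""

    def __init__(self, epsilon=0.1, alpha=0.5, gamma=0.9, seed=None):
        self.q = defaultdict(float)   # tabular stand-in for the Q-network
        self.epsilon, self.alpha, self.gamma = epsilon, alpha, gamma
        self.rng = random.Random(seed)

    def act(self, obs):
        # Epsilon-greedy choice over the joint (channel, mode) actions.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(obs, a)])

    def update(self, obs, action, reward, next_obs):
        # One-step Q-learning update toward reward + gamma * max_a' Q(next_obs, a').
        best_next = max(self.q[(next_obs, a)] for a in ACTIONS)
        td_error = reward + self.gamma * best_next - self.q[(obs, action)]
        self.q[(obs, action)] += self.alpha * td_error


def step(pu_busy, actions):
    """Toy reward (assumption): 1 if the chosen channel is PU-free and uncontended."""
    chosen = [ch for ch, _ in actions]
    return [
        1.0 if not pu_busy[ch] and chosen.count(ch) == 1 else 0.0
        for ch in chosen
    ]


def run(n_agents=2, steps=2000, seed=0):
    """Simulate the SUs and return the average summed reward per step."""
    rng = random.Random(seed)
    agents = [IndependentAgent(seed=seed + i + 1) for i in range(n_agents)]
    obs = [(0, False)] * n_agents  # partial observation: (last channel, PU busy there?)
    total = 0.0
    for _ in range(steps):
        pu_busy = [rng.random() < 0.3 for _ in range(N_CHANNELS)]  # assumed PU activity
        actions = [agent.act(o) for agent, o in zip(agents, obs)]
        rewards = step(pu_busy, actions)
        next_obs = [(ch, pu_busy[ch]) for ch, _ in actions]
        for agent, o, a, r, o2 in zip(agents, obs, actions, rewards, next_obs):
            agent.update(o, a, r, o2)
        obs = next_obs
        total += sum(rewards)
    return total / steps


if __name__ == "__main__":
    print("average summed reward per step:", run())
```

Each agent keeps its own Q-table and sees only its own observation, which mirrors the "no information exchange" property in the abstract. Replacing the table with a neural network and the toy reward with an achievable-rate expression would move the sketch closer to a DQN-based scheme, but those details are not given in the abstract.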