Two-Step Deep Reinforcement Q-Learning based Relay Selection in Cooperative WPCNs

Gulnur Tolebi, T. Tsiftsis, G. Nauryzbayev

2023 International Balkan Conference on Communications and Networking (BalkanCom), 5 June 2023. DOI: 10.1109/BalkanCom58402.2023.10167871

Abstract

In this paper, we propose an intelligent relay selection scheme employing deep reinforcement learning for a wireless-powered cooperative network. We formulate the problem as a Markov decision process with unknown transition probabilities between states, and therefore propose a model-free, off-policy relay selection model. The model is deployed using a deep Q-network with an updated relay selection process. Using channel characteristics, we identify inaccessible nodes so as to form a pool of relays available for transmission, and we encourage the neural network to choose from this pool. In addition, we propose a novel reward policy for training the model that is based on the energy levels stored at the relays and promotes the system to expend energy. We numerically quantify the network performance in terms of outage probability and energy outage probability and compare the results with basic Q-learning.
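The abstract does not disclose implementation details, but the two-step idea it describes can be sketched in a few lines: first use channel characteristics and stored energy to form the pool of relays actually available for transmission, then let the learned Q-values pick within that pool, with a reward that scales with the chosen relay's stored energy. The following is a minimal illustrative sketch, not the authors' implementation: the channel/energy thresholds, the reward shape, and the use of a plain Q-value array (standing in for the paper's deep Q-network) are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

N_RELAYS = 4
CHANNEL_THRESHOLD = 0.3  # assumed cutoff separating accessible from inaccessible nodes


def feasible_relays(channel_gains, energies, energy_min=1.0):
    """Step 1: use channel characteristics and stored energy to form the
    pool of relays available for transmission (assumed feasibility rule)."""
    return [i for i in range(N_RELAYS)
            if channel_gains[i] >= CHANNEL_THRESHOLD and energies[i] >= energy_min]


def reward(chosen, energies, success):
    """Illustrative reward: pay out on successful relaying and add a term that
    grows with the chosen relay's stored energy, so the policy is pushed to
    expend energy rather than hoard it (assumed reward shape)."""
    return (1.0 if success else -1.0) + 0.1 * energies[chosen]


def select_relay(q_values, pool, epsilon=0.1):
    """Step 2: epsilon-greedy choice restricted to the feasible pool; in the
    paper the Q-values would come from a deep Q-network, here a plain array."""
    if not pool:
        return None  # no feasible relay: transmission outage in this slot
    if rng.random() < epsilon:
        return int(rng.choice(pool))
    return max(pool, key=lambda i: q_values[i])
```

Restricting the action space to the feasible pool before the greedy step is one plausible reading of "encourage the neural network to choose them": the agent never wastes exploration on relays that cannot transmit anyway.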