Two-Step Deep Reinforcement Q-Learning based Relay Selection in Cooperative WPCNs

Gulnur Tolebi, T. Tsiftsis, G. Nauryzbayev

2023 International Balkan Conference on Communications and Networking (BalkanCom), 5 June 2023. DOI: 10.1109/BalkanCom58402.2023.10167871

Abstract

In this paper, we propose an intelligent relay selection scheme employing deep reinforcement learning for a wireless-powered cooperative network. We formulate the problem as a Markov decision process with unknown transition probabilities between states, and therefore propose a model-free, off-policy relay selection model. The model is deployed using a deep Q-network with an updated relay selection process. Using channel characteristics, we identify inaccessible nodes so as to form a pool of relays available for transmission, and we encourage the neural network to choose from this pool. In addition, we propose a novel reward policy for training the model that is based on the energy levels stored at the relays and promotes the system to expend energy. We numerically quantify the network performance in terms of outage probability and energy outage probability and compare the results with basic Q-learning.
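The abstract does not disclose implementation details, but the two-step idea it describes can be sketched in a few lines: first use channel characteristics and stored energy to form the pool of relays actually available for transmission, then let the learned Q-values pick within that pool, with a reward that scales with the chosen relay's stored energy. The following is a minimal illustrative sketch, not the authors' implementation: the channel/energy thresholds, the reward shape, and the use of a plain Q-value array (standing in for the paper's deep Q-network) are all assumptions made for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)

N_RELAYS = 4
CHANNEL_THRESHOLD = 0.3  # assumed cutoff separating accessible from inaccessible nodes


def feasible_relays(channel_gains, energies, energy_min=1.0):
    """Step 1: use channel characteristics and stored energy to form the
    pool of relays available for transmission (assumed feasibility rule)."""
    return [i for i in range(N_RELAYS)
            if channel_gains[i] >= CHANNEL_THRESHOLD and energies[i] >= energy_min]


def reward(chosen, energies, success):
    """Illustrative reward: pay out on successful relaying and add a term that
    grows with the chosen relay's stored energy, so the policy is pushed to
    expend energy rather than hoard it (assumed reward shape)."""
    return (1.0 if success else -1.0) + 0.1 * energies[chosen]


def select_relay(q_values, pool, epsilon=0.1):
    """Step 2: epsilon-greedy choice restricted to the feasible pool; in the
    paper the Q-values would come from a deep Q-network, here a plain array."""
    if not pool:
        return None  # no feasible relay: transmission outage in this slot
    if rng.random() < epsilon:
        return int(rng.choice(pool))
    return max(pool, key=lambda i: q_values[i])
```

Restricting the action space to the feasible pool before the greedy step is one plausible reading of "encourage the neural network to choose them": the agent never wastes exploration on relays that cannot transmit anyway.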