Rate Maximization with Reinforcement Learning for Time-Varying Energy Harvesting Broadcast Channels

2019 IEEE Global Communications Conference (GLOBECOM). Pub Date: 2019-12-01. DOI: 10.1109/GLOBECOM38437.2019.9013583

Heasung Kim, W. Shin, Heecheol Yang, Nayoung Lee, Jungwoo Lee


Citations: 5

Abstract

In this paper, we consider power allocation for a time-varying fading broadcast channel in an energy harvesting communication system, in which a transmitter with a rechargeable battery sends messages to receivers using harvested energy. We first prove that the optimal online power allocation policy for maximizing the sum rate is an increasing function of the harvested energy, the remaining battery level, and each user's channel gain. We then construct a neural network whose structure reflects this increasing behavior of the optimal policy. This two-step approach provides a fundamental guideline for neural network design and yields an effective function approximator, so the representational capacity of the network is not wasted. Using this network, we apply the policy gradient method to solve the power allocation problem. To validate our approach, we compare it with the closed-form optimal policy in a partially observable Markov problem. Further experiments show that our online solution achieves performance close to the theoretical upper bound in a time-varying fading broadcast channel.
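The monotonicity result in the abstract suggests restricting the policy network to functions that are non-decreasing in every input. The abstract does not describe the authors' architecture, so the following is only a minimal sketch of one common way to obtain such a network: keep the effective weights positive and use monotone activations. The class name `MonotoneMLP`, the layer sizes, and the input layout are assumptions made for illustration, and battery feasibility constraints and the policy gradient training loop are omitted.

```python
import numpy as np


def softplus(x):
    # Numerically stable softplus; maps any real number to a positive one.
    return np.logaddexp(0.0, x)


class MonotoneMLP:
    """Hypothetical two-layer network, non-decreasing in every input.

    Assumption: monotonicity is enforced by passing unconstrained
    parameters through softplus so the effective weights are positive,
    and by using monotone activations (tanh, softplus).
    """

    def __init__(self, n_inputs, n_hidden, seed=0):
        rng = np.random.default_rng(seed)
        # Unconstrained parameters; effective weights are softplus(raw) > 0.
        self.raw_w1 = rng.normal(size=(n_inputs, n_hidden))
        self.b1 = rng.normal(size=n_hidden)
        self.raw_w2 = rng.normal(size=(n_hidden, 1))
        self.b2 = rng.normal(size=1)

    def __call__(self, x):
        x = np.atleast_2d(np.asarray(x, dtype=float))
        # Positive weights followed by an increasing activation keep each
        # hidden unit non-decreasing in every input coordinate.
        h = np.tanh(x @ softplus(self.raw_w1) + self.b1)
        z = h @ softplus(self.raw_w2) + self.b2
        # Softplus output gives a non-negative value, read here as power.
        return softplus(z[:, 0])


# Assumed input layout: [harvested energy, battery level, gain 1, gain 2].
net = MonotoneMLP(n_inputs=4, n_hidden=8)
state = np.array([0.5, 1.0, 0.3, 0.7])
power = net(state)[0]
```

Because every path from an input to the output multiplies only positive weights and passes through increasing functions, raising any single input (more harvested energy, a fuller battery, or a better channel) can never lower the output, which is the structural property the abstract relies on.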



