Heasung Kim, W. Shin, Heecheol Yang, Nayoung Lee, Jungwoo Lee
Rate Maximization with Reinforcement Learning for Time-Varying Energy Harvesting Broadcast Channels
2019 IEEE Global Communications Conference (GLOBECOM), pp. 1-6, published 2019-12-01
DOI: 10.1109/GLOBECOM38437.2019.9013583 (https://doi.org/10.1109/GLOBECOM38437.2019.9013583)
Citations: 5
Abstract
In this paper, we consider a power allocation optimization technique for a time-varying fading broadcast channel in energy harvesting communication systems, in which a transmitter with a rechargeable battery transmits messages to receivers using the harvested energy. We first prove that the optimal online power allocation policy for sum rate maximization at the transmitter is an increasing function of the harvested energy, the remaining battery level, and each user's channel gain. We then construct a neural network that exploits this increasing behavior of the optimal policy. This two-step approach provides an effective function approximation as well as a fundamental guideline for neural network design, and so avoids wasting the representational capacity of the network. On the basis of this neural network, we apply the policy gradient method to solve the power allocation problem. To validate the performance of our approach, we compare it with the closed-form optimal policy in a partially observable Markov problem. Further experiments show that our online solution achieves performance close to the theoretical upper bound in a time-varying fading broadcast channel.
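The abstract does not say how the increasing behavior is built into the network. One common way to obtain a function that is non-decreasing in every input is to force all weights to be positive and to use only increasing activations. The following is a minimal sketch under that assumption, not the authors' architecture. The names `init_params` and `monotone_policy`, the layer sizes, the two-user state layout, and the choice to output a fraction of the available energy are all hypothetical.

```python
import numpy as np

def softplus(x):
    # Smooth map from unconstrained reals to strictly positive reals.
    return np.logaddexp(0.0, x)

def init_params(rng, sizes):
    # Hypothetical helper: unconstrained raw weights and biases per layer.
    return [(rng.normal(size=(m, n)), rng.normal(size=n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def monotone_policy(params, state):
    """Map a state vector to a fraction of the available energy to spend.

    Every effective weight is softplus(raw) > 0 and every activation
    (tanh, sigmoid) is increasing, so the output is non-decreasing in
    each input. This is an assumed construction for illustration.
    """
    h = np.asarray(state, dtype=float)
    for i, (raw_w, b) in enumerate(params):
        h = h @ softplus(raw_w) + b        # positive weights keep monotonicity
        if i < len(params) - 1:
            h = np.tanh(h)                 # increasing hidden activation
    return 1.0 / (1.0 + np.exp(-h))        # sigmoid keeps the output in [0, 1]

rng = np.random.default_rng(0)
params = init_params(rng, [4, 16, 1])

# Assumed state layout: [harvested energy, remaining battery, gain user 1, gain user 2]
state = np.array([0.5, 1.0, 0.8, 0.3])
frac = monotone_policy(params, state)
print("fraction of available energy to transmit:", frac[0])
```

In a policy gradient setup, a network of this kind could serve as the mean of a stochastic policy whose raw parameters are updated by gradient ascent on the expected sum rate. Because the positivity constraint lives inside `softplus`, the raw parameters stay unconstrained and an ordinary gradient step preserves monotonicity.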