Fang Ye, Yinjie Zhang, Yibing Li, T. Jiang, Yingsong Li
2020 IEEE USNC-CNC-URSI North American Radio Science Meeting (Joint with AP-S Symposium), published 2020-07-05. DOI: 10.23919/USNC/URSI49741.2020.9321658
Power Control Based on Deep Q Network with Modified Reward Function in Cognitive Networks
This paper aims to design an appropriate power control policy that allows the secondary user (SU) to share the spectrum with the primary user without causing harmful interference. For a dynamic spectrum environment, we develop a power control policy based on deep reinforcement learning with a Deep Q Network (DQN), so that the secondary user can intelligently adjust its transmit power. The reward function is carefully designed to avoid the sparse-reward problem, which can prevent the secondary user from reaching an effective power level within a limited number of steps and thus cause transmission failure. Our experimental results show that, with the help of the proposed network and reward function, the secondary user can quickly and efficiently converge to an effective power level from any initial state.
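The core idea in the abstract (replacing a sparse success/failure reward with one that grades failed power choices by their distance to the effective region) can be illustrated with a toy sketch. The paper trains a DQN; for a self-contained illustration this sketch swaps in tabular Q-learning over a one-dimensional discrete power grid. All specifics here (the power levels, the "effective" region, the reward scale, the step limit) are illustrative assumptions, not the paper's actual model.

```python
import random

# Toy model: the SU picks a discrete power index; a level is "effective"
# when it would meet the SU's SINR target without excessive interference
# at the primary user. EFFECTIVE is an assumed region, not from the paper.
POWER_LEVELS = list(range(10))   # discrete power indices 0..9
EFFECTIVE = {4, 5, 6}            # assumed effective power region

def sparse_reward(p):
    # Sparse reward: feedback only on success, so most steps teach nothing.
    return 1.0 if p in EFFECTIVE else 0.0

def shaped_reward(p):
    # Modified reward (the paper's idea, sketched): failed choices are
    # penalized by distance to the effective region, giving a gradient
    # the agent can follow within a limited number of steps.
    if p in EFFECTIVE:
        return 1.0
    return -0.1 * min(abs(p - q) for q in EFFECTIVE)

def train(reward_fn, episodes=2000, alpha=0.5, gamma=0.9, eps=0.2, seed=0):
    rng = random.Random(seed)
    # State = current power level; actions: 0 = step down, 1 = stay, 2 = step up.
    Q = {(s, a): 0.0 for s in POWER_LEVELS for a in range(3)}
    for _ in range(episodes):
        s = rng.choice(POWER_LEVELS)              # random initial power
        for _ in range(15):                       # limited adjustment steps
            if rng.random() < eps:
                a = rng.randrange(3)              # explore
            else:
                a = max(range(3), key=lambda x: Q[(s, x)])  # exploit
            s2 = min(max(s + a - 1, 0), len(POWER_LEVELS) - 1)
            r = reward_fn(s2)
            best_next = max(Q[(s2, x)] for x in range(3))
            Q[(s, a)] += alpha * (r + gamma * best_next - Q[(s, a)])
            s = s2

    return Q

def greedy_final_power(Q, s, steps=15):
    # Follow the learned greedy policy for a bounded number of steps.
    for _ in range(steps):
        a = max(range(3), key=lambda x: Q[(s, x)])
        s = min(max(s + a - 1, 0), len(POWER_LEVELS) - 1)
    return s

Q = train(shaped_reward)
# With the shaped reward, the greedy policy reaches the effective region
# from every initial power level within the step budget.
print(all(greedy_final_power(Q, s) in EFFECTIVE for s in POWER_LEVELS))
```

Swapping `sparse_reward` into `train` shows why the paper modifies the reward: with zero reward everywhere outside the effective region, early episodes provide no directional signal, so convergence from distant initial states is much slower or fails within the step limit.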