Authors: Le Tong, Yangyi Chen, Xin Zhou, Yifu Sun
Published in: Proceedings of the 5th International Conference on Computer Science and Application Engineering (2021-10-19)
DOI: 10.1145/3487075.3487137 (https://doi.org/10.1145/3487075.3487137)
QoE-Fairness Tradeoff Scheme for Dynamic Spectrum Allocation Based on Deep Reinforcement Learning
When spectrum resources are scarce, balancing quality of experience (QoE) against fairness requires studying the dynamic spectrum allocation problem, particularly in the scenario where a base station, acting as a single agent, centrally manages the spectrum to communicate reliably with multiple users. Because user behavior and the environment are unknown and dynamic, this paper models dynamic spectrum allocation as an optimization problem and proposes an allocation strategy based on an adaptive deep Q-learning network (ADQN). A new reward function that accounts for the communication needs of different user types drives the learning process, and a prioritized experience replay strategy based on reducing the temporal-difference (TD) error accelerates network training. Simulation results show that the proposed strategy speeds up the convergence of ADQN and improves the rationality and effectiveness of dynamic spectrum allocation.
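
The abstract does not spell out how its priority experience replay works, so the following is only a minimal sketch of the generic proportional prioritized replay technique, not the authors' implementation. The class name `PrioritizedReplayBuffer`, the exponents `alpha` and `beta`, and the offset `eps` are assumptions chosen for illustration. Transitions with a larger absolute TD error are sampled more often, and importance-sampling weights compensate for the resulting bias.

```python
import random


class PrioritizedReplayBuffer:
    """Illustrative proportional prioritized replay (hypothetical names).

    Transition i is sampled with probability p_i**alpha / sum_k p_k**alpha,
    where p_i = |TD error| + eps.
    """

    def __init__(self, capacity, alpha=0.6, eps=1e-3):
        self.capacity = capacity
        self.alpha = alpha      # 0 gives uniform sampling, 1 gives fully proportional
        self.eps = eps          # keeps every priority strictly positive
        self.items = []
        self.priorities = []
        self.next_slot = 0      # ring-buffer write position

    def add(self, transition):
        # New transitions take the current maximum priority so that they
        # are likely to be replayed before their TD error is known.
        priority = max(self.priorities, default=1.0)
        if len(self.items) < self.capacity:
            self.items.append(transition)
            self.priorities.append(priority)
        else:
            self.items[self.next_slot] = transition
            self.priorities[self.next_slot] = priority
        self.next_slot = (self.next_slot + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        indices = random.choices(range(len(self.items)), weights=probs, k=batch_size)
        # Importance-sampling weights, normalised by the largest weight in the batch.
        n = len(self.items)
        weights = [(n * probs[i]) ** (-beta) for i in indices]
        largest = max(weights)
        weights = [w / largest for w in weights]
        batch = [self.items[i] for i in indices]
        return indices, batch, weights

    def update_priorities(self, indices, td_errors):
        # Called after a learning step with the freshly computed TD errors.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a DQN-style training loop, the agent would call `add` after each environment step, draw a batch with `sample`, scale each sample's loss by its weight, and then pass the new TD errors to `update_priorities`. How the paper adapts this scheme to ADQN is not stated in the abstract.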