Deep Q-network Based Reinforcement Learning for Distributed Dynamic Spectrum Access
Manish Anand Yadav, Yuhui Li, Guangjin Fang, Bin Shen
2022 IEEE 2nd International Conference on Computer Communication and Artificial Intelligence (CCAI), published 2022-05-06. DOI: 10.1109/CCAI55564.2022.9807797
To address spectrum scarcity and spectrum under-utilization in wireless networks, we propose a double deep Q-network based reinforcement learning algorithm for distributed dynamic spectrum access. Each channel in the network alternates between busy and idle states according to a two-state Markov chain. At the start of each time slot, every secondary user (SU) performs spectrum sensing on each channel and accesses one based on the sensing result together with the output of our algorithm's Q-network. Over time, the Deep Reinforcement Learning (DRL) algorithm learns the spectrum environment and becomes effective at modeling the behavior patterns of the primary users (PUs). Through simulation, we show that the proposed algorithm is simple to train, yet effective in reducing interference to both primary and secondary users and in achieving a higher rate of successful transmissions.
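The abstract describes two core ingredients: channels that evolve as two-state (busy/idle) Markov chains, and a double deep Q-network in which the online network selects the next action while a target network evaluates it. The sketch below illustrates these ideas only; it is not the authors' implementation, and the channel count, transition probabilities, network size, and learning rate are illustrative assumptions.

```python
# Minimal sketch (not the paper's code) of a two-state Markov channel model
# and a single double-DQN update step. All hyperparameters are assumptions.
import numpy as np
import torch
import torch.nn as nn

N_CHANNELS = 4            # assumed number of PU channels
P_IDLE_TO_BUSY = 0.2      # assumed Markov transition probabilities
P_BUSY_TO_IDLE = 0.3
GAMMA = 0.9               # discount factor

def step_channels(states: np.ndarray) -> np.ndarray:
    """Advance each channel's busy(1)/idle(0) state by one time slot."""
    rand = np.random.rand(N_CHANNELS)
    nxt = states.copy()
    nxt[(states == 0) & (rand < P_IDLE_TO_BUSY)] = 1
    nxt[(states == 1) & (rand < P_BUSY_TO_IDLE)] = 0
    return nxt

class QNet(nn.Module):
    """Small MLP mapping sensed channel states to per-channel Q-values."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_CHANNELS, 64), nn.ReLU(),
            nn.Linear(64, N_CHANNELS))
    def forward(self, x):
        return self.net(x)

online, target = QNet(), QNet()
target.load_state_dict(online.state_dict())
opt = torch.optim.Adam(online.parameters(), lr=1e-3)

def double_dqn_update(s, a, r, s_next):
    """One double-DQN step: online net picks the argmax action,
    target net evaluates it (reduces overestimation bias)."""
    q = online(s).gather(1, a)                              # Q(s, a)
    with torch.no_grad():
        a_star = online(s_next).argmax(dim=1, keepdim=True) # action selection
        q_next = target(s_next).gather(1, a_star)           # action evaluation
        y = r + GAMMA * q_next                              # bootstrapped target
    loss = nn.functional.mse_loss(q, y)
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()
```

In a distributed setting of the kind the abstract outlines, each SU would run its own copy of such an agent, feeding its per-slot sensing results in as the state and receiving a reward that penalizes collisions with PUs or other SUs; those reward details are not specified in the abstract and are omitted here.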