H. Kamal, M. Coupechoux, P. Godlewski, J. Kélif
European Transactions on Telecommunications, vol. 106, no. 1, pp. 694-703. Published 2010-12-01. DOI: 10.1002/ett.1456 (https://doi.org/10.1002/ett.1456)
Citations: 6
Optimal, heuristic and Q-learning based DSA policies for cellular networks with coordinated access band
Driven by growing demand for higher-data-rate applications and by the current crowding of the spectrum, Dynamic Spectrum Access (DSA) has become an active research topic. In this paper, we analyse DSA in the context of cellular networks, where a Coordinated Access Band (CAB) is shared between Radio Access Networks (RANs). We propose a Semi-Markov Decision Process (SMDP) approach to derive the optimal DSA policies in terms of operator reward. To overcome the limitations of implementing the optimal policy, we also propose two simple, though sub-optimal, DSA algorithms: a Q-learning (QL) based algorithm and a heuristic algorithm. The reward achieved with the heuristic algorithm is shown to be very close to the optimal case and thus to significantly exceed the reward obtained with Fixed Spectrum Access (FSA). The rewards achieved with the QL-based algorithm are also shown to exceed those obtained with FSA. The higher rewards and better spectrum utilisation of the optimal and heuristic DSA methods are, however, obtained at the price of a reduced average user throughput. Copyright © 2010 John Wiley & Sons, Ltd.
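
For readers unfamiliar with Q-learning, the following is a minimal sketch of the generic tabular update rule that QL-based schemes build on. It is not the algorithm from the paper: the state and action labels ("lease", "release"), the reward value, and the parameters `alpha`, `gamma` and `epsilon` are illustrative assumptions, and the update uses the plain discrete-time form rather than an SMDP formulation with sojourn-time-dependent discounting.

```python
import random
from collections import defaultdict


def q_update(q, state, action, reward, next_state, actions, alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning step and return the new Q(state, action).

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    """
    best_next = max(q[(next_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


def epsilon_greedy(q, state, actions, epsilon, rng):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q[(state, a)])


# Hypothetical example: states count the shared-band blocks a RAN holds,
# and the actions are to lease or release one block (labels are assumptions).
actions = ["lease", "release"]
q = defaultdict(float)  # unseen (state, action) pairs start at 0.0
new_value = q_update(q, 0, "lease", 1.0, 1, actions, alpha=0.5, gamma=0.9)
choice = epsilon_greedy(q, 0, actions, epsilon=0.0, rng=random.Random(0))
```

With `epsilon=0.0` the selection is purely greedy, so after the single update above the agent prefers the only action that has accumulated a positive value in state 0. In practice `epsilon` is kept above zero so the agent keeps exploring.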