Authors: Navikkumar Modi, P. Mary, C. Moy
Journal: EAI Endorsed Trans. Wirel. Spectr., volume 2 1
Published: 2017-01-10
DOI: 10.4108/eai.9-1-2017.152098 (https://doi.org/10.4108/eai.9-1-2017.152098)
Efficient Learning in Stationary and Non-stationary OSA Scenario with QoS Guaranty
In this work, the opportunistic spectrum access (OSA) problem is addressed with stationary and non-stationary Markov multi-armed bandit (MAB) frameworks. For stationary environments, we propose a novel index-based algorithm named QoS-UCB that balances exploration in terms of both occupancy and quality, e.g. signal-to-noise ratio (SNR) for transmission. Furthermore, we propose another learning policy, named discounted QoS-UCB (DQoS-UCB), for the non-stationary case. Our contribution in terms of numerical analysis is twofold: i) in the stationary OSA scenario, we numerically compare our QoS-UCB policy with the existing UCB1 policy and show that QoS-UCB outperforms UCB1 in terms of regret, and ii) in the non-stationary OSA scenario, numerical results show that the proposed DQoS-UCB policy outperforms other simple UCB policies as well as QoS-UCB. To the best of our knowledge, this is the first learning algorithm that adapts to a non-stationary Markov MAB framework and also quantifies channel quality information.
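The abstract names UCB1 and a discounted variant but does not give the QoS-UCB or DQoS-UCB index formulas, so the following is only a rough sketch of the two generic building blocks: the standard UCB1 index and a textbook discounted UCB rule. It is not the authors' algorithm. The names `qos_reward` and `DiscountedUCB`, the parameters `gamma` and `xi`, and the choice of folding channel quality into the reward are all assumptions made for illustration.

```python
import math


def ucb1_index(mean, pulls, t):
    """Standard UCB1 index: empirical mean plus an exploration bonus."""
    if pulls == 0:
        return math.inf  # an unplayed channel is always tried first
    return mean + math.sqrt(2.0 * math.log(t) / pulls)


def qos_reward(channel_free, quality):
    """Hypothetical reward: quality in [0, 1] if the channel is free, else 0.

    This is an illustrative assumption, not the reward used in the paper.
    """
    return quality if channel_free else 0.0


class DiscountedUCB:
    """Generic discounted UCB for rewards in [0, 1] (not DQoS-UCB itself)."""

    def __init__(self, n_arms, gamma=0.98, xi=0.6):
        self.gamma = gamma             # discount factor, 0 < gamma < 1
        self.xi = xi                   # exploration weight
        self.counts = [0.0] * n_arms   # discounted play counts
        self.sums = [0.0] * n_arms     # discounted reward sums

    def select(self):
        # Play every arm once before using the index.
        for arm, count in enumerate(self.counts):
            if count == 0.0:
                return arm
        # Clamp so the logarithm is never negative early on.
        total = max(sum(self.counts), 1.0)

        def index(arm):
            mean = self.sums[arm] / self.counts[arm]
            bonus = 2.0 * math.sqrt(self.xi * math.log(total) / self.counts[arm])
            return mean + bonus

        return max(range(len(self.counts)), key=index)

    def update(self, arm, reward):
        # Old observations fade geometrically, so recent rewards dominate.
        self.counts = [self.gamma * c for c in self.counts]
        self.sums = [self.gamma * s for s in self.sums]
        self.counts[arm] += 1.0
        self.sums[arm] += reward
```

In a non-stationary setting, discounting lets the estimate for a channel track its recent behaviour: a typical loop would call `select()`, sense the chosen channel, and pass `qos_reward(...)` to `update()`.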