Greedy confidence bound techniques for restless multi-armed bandit based Cognitive Radio

2013 47th Annual Conference on Information Sciences and Systems (CISS). Pub Date: 2013-03-20. DOI: 10.1109/CISS.2013.6552267

Shuya Dong, Jungwoo Lee

Citations: 1

Abstract

In this paper, we deal with Bayesian restless multi-armed bandit (RMAB) techniques applied to Cognitive Radio. We assume there are multiple arms, each of which evolves as a Markov chain with known parameters. A player seeks to activate more than one arm at each time step in order to maximize the expected total reward with multiple plays. We also consider the non-Bayesian RMAB, where the parameters of the Markov chain are unknown. We propose a simple but effective algorithm called the two-slot greedy confidence bound algorithm (Two-slot GCB), which performs better than existing upper confidence bound (UCB) algorithms.
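
To make the setting concrete, the following is a minimal sketch of the kind of UCB baseline the abstract compares against. It is not the Two-slot GCB algorithm, whose details are not given here. It assumes each arm is a two-state Markov channel (state 1 = idle, reward 1; state 0 = busy, reward 0), that the player senses a fixed number of channels per slot, and that the standard UCB1 index is used. The function name `simulate_ucb_multiplay` and the parameters `p01`, `p11`, `num_plays`, and `horizon` are hypothetical names chosen for illustration.

```python
import math
import random


def simulate_ucb_multiplay(p01, p11, num_plays, horizon, seed=0):
    """Run a UCB1-style index policy on restless two-state Markov arms.

    p01[i]: probability that arm i moves from state 0 (busy) to 1 (idle).
    p11[i]: probability that arm i stays in state 1 (idle).
    Returns (total_reward, play_counts). Illustrative sketch only.
    """
    rng = random.Random(seed)
    n_arms = len(p01)
    # Start each arm from its stationary distribution:
    # P(state = 1) = p01 / (p01 + 1 - p11).
    states = [
        1 if rng.random() < p01[i] / (p01[i] + 1.0 - p11[i]) else 0
        for i in range(n_arms)
    ]
    counts = [0] * n_arms   # number of times each arm was played
    sums = [0.0] * n_arms   # reward collected from each arm
    total = 0.0

    for t in range(1, horizon + 1):
        # UCB1 index: empirical mean plus exploration bonus.
        # Arms that were never played get an infinite index.
        indices = [
            float("inf") if counts[i] == 0
            else sums[i] / counts[i] + math.sqrt(2.0 * math.log(t) / counts[i])
            for i in range(n_arms)
        ]
        # Multiple plays: activate the num_plays arms with the largest index.
        chosen = sorted(range(n_arms), key=lambda i: indices[i], reverse=True)
        for i in chosen[:num_plays]:
            reward = float(states[i])  # 1 when the sensed channel is idle
            counts[i] += 1
            sums[i] += reward
            total += reward

        # Restless: every arm evolves, whether or not it was played.
        for i in range(n_arms):
            p_idle_next = p11[i] if states[i] == 1 else p01[i]
            states[i] = 1 if rng.random() < p_idle_next else 0

    return total, counts


if __name__ == "__main__":
    reward, plays = simulate_ucb_multiplay(
        p01=[0.2, 0.3, 0.5, 0.8], p11=[0.3, 0.5, 0.7, 0.9],
        num_plays=2, horizon=500, seed=1,
    )
    print(reward, plays)
```

The final loop is what makes the problem restless: the channels keep changing state even when they are not sensed, so the empirical means track long-run idle probabilities rather than the current state.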
