机会频谱接入的成本意识学习与优化

2020 Information Theory and Applications Workshop (ITA) Pub Date : 2020-02-02 DOI:10.1109/ITA50056.2020.9244934

Chao Gan, Ruida Zhou, Jing Yang, Cong Shen

{"title":"机会频谱接入的成本意识学习与优化","authors":"Chao Gan, Ruida Zhou, Jing Yang, Cong Shen","doi":"10.1109/ITA50056.2020.9244934","DOIUrl":null,"url":null,"abstract":"In this paper, we investigate cost-aware joint learning and optimization for multi-channel opportunistic spectrum access in a cognitive radio system. We investigate a discrete-time model where the time axis is partitioned into frames. Each frame consists of a sensing phase, followed by a transmission phase. During the sensing phase, the user is able to sense a subset of channels sequentially before it decides to use one of them in the following transmission phase. We assume the channel states alternate between busy and idle according to independent Bernoulli random processes from frame to frame. To capture the inherent uncertainty in channel sensing, we assume the reward of each transmission when the channel is idle is a random variable. We also associate random costs with sensing and transmission actions. Our objective is to understand how the costs and reward of the actions would affect the optimal behavior of the user in both offline and online settings, and design the corresponding opportunistic spectrum access strategies to maximize the expected cumulative net reward (i.e., reward-minus-cost).We start with an offline setting where the statistics of the channel status, costs and reward are known beforehand. We show that the the optimal policy exhibits a recursive double-threshold structure, and the user needs to compare the channel statistics with those thresholds sequentially in order to decide its actions. With such insights, we then study the online setting, where the statistical information of the channels, costs and reward are unknown a priori. We judiciously balance exploration and exploitation, and show that the cumulative regret scales in O(log T). We also establish a matched lower bound, which implies that our online algorithm is order-optimal. Simulation results corroborate our theoretical analysis.","PeriodicalId":137257,"journal":{"name":"2020 Information Theory and Applications Workshop (ITA)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2020-02-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Cost-Aware Learning and Optimization for Opportunistic Spectrum Access\",\"authors\":\"Chao Gan, Ruida Zhou, Jing Yang, Cong Shen\",\"doi\":\"10.1109/ITA50056.2020.9244934\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper, we investigate cost-aware joint learning and optimization for multi-channel opportunistic spectrum access in a cognitive radio system. We investigate a discrete-time model where the time axis is partitioned into frames. Each frame consists of a sensing phase, followed by a transmission phase. During the sensing phase, the user is able to sense a subset of channels sequentially before it decides to use one of them in the following transmission phase. We assume the channel states alternate between busy and idle according to independent Bernoulli random processes from frame to frame. To capture the inherent uncertainty in channel sensing, we assume the reward of each transmission when the channel is idle is a random variable. We also associate random costs with sensing and transmission actions. Our objective is to understand how the costs and reward of the actions would affect the optimal behavior of the user in both offline and online settings, and design the corresponding opportunistic spectrum access strategies to maximize the expected cumulative net reward (i.e., reward-minus-cost).We start with an offline setting where the statistics of the channel status, costs and reward are known beforehand. We show that the the optimal policy exhibits a recursive double-threshold structure, and the user needs to compare the channel statistics with those thresholds sequentially in order to decide its actions. With such insights, we then study the online setting, where the statistical information of the channels, costs and reward are unknown a priori. We judiciously balance exploration and exploitation, and show that the cumulative regret scales in O(log T). We also establish a matched lower bound, which implies that our online algorithm is order-optimal. Simulation results corroborate our theoretical analysis.\",\"PeriodicalId\":137257,\"journal\":{\"name\":\"2020 Information Theory and Applications Workshop (ITA)\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-02-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 Information Theory and Applications Workshop (ITA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ITA50056.2020.9244934\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 Information Theory and Applications Workshop (ITA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ITA50056.2020.9244934","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 0

摘要

本文研究了认知无线电系统中多信道机会频谱接入的成本感知联合学习和优化问题。我们研究了一个离散时间模型，其中时间轴被划分为帧。每一帧包括一个感知阶段，然后是一个传输阶段。在感知阶段，用户能够在决定在下一个传输阶段使用其中一个之前依次感知信道子集。我们假设信道状态根据帧与帧之间独立的伯努利随机过程在繁忙和空闲之间交替。为了捕捉信道感知中固有的不确定性，我们假设当信道空闲时，每次传输的奖励是一个随机变量。我们还将随机成本与传感和传输行为联系起来。我们的目标是了解行动的成本和回报如何影响用户在离线和在线设置下的最佳行为，并设计相应的机会主义频谱接入策略，以最大化预期累积净回报(即奖励-成本)。我们从离线设置开始，其中渠道状态、成本和奖励的统计数据是事先知道的。我们表明，最优策略呈现递归双阈值结构，用户需要将通道统计数据与这些阈值进行顺序比较，以决定其操作。有了这样的见解，我们就研究了在线环境，其中渠道、成本和奖励的统计信息是先验未知的。我们明智地平衡了探索和开发，并表明累积后悔规模为O(log T)。我们还建立了一个匹配的下界，这意味着我们的在线算法是顺序最优的。仿真结果证实了我们的理论分析。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Cost-Aware Learning and Optimization for Opportunistic Spectrum Access

In this paper, we investigate cost-aware joint learning and optimization for multi-channel opportunistic spectrum access in a cognitive radio system. We investigate a discrete-time model where the time axis is partitioned into frames. Each frame consists of a sensing phase, followed by a transmission phase. During the sensing phase, the user is able to sense a subset of channels sequentially before it decides to use one of them in the following transmission phase. We assume the channel states alternate between busy and idle according to independent Bernoulli random processes from frame to frame. To capture the inherent uncertainty in channel sensing, we assume the reward of each transmission when the channel is idle is a random variable. We also associate random costs with sensing and transmission actions. Our objective is to understand how the costs and reward of the actions would affect the optimal behavior of the user in both offline and online settings, and design the corresponding opportunistic spectrum access strategies to maximize the expected cumulative net reward (i.e., reward-minus-cost).We start with an offline setting where the statistics of the channel status, costs and reward are known beforehand. We show that the the optimal policy exhibits a recursive double-threshold structure, and the user needs to compare the channel statistics with those thresholds sequentially in order to decide its actions. With such insights, we then study the online setting, where the statistical information of the channels, costs and reward are unknown a priori. We judiciously balance exploration and exploitation, and show that the cumulative regret scales in O(log T). We also establish a matched lower bound, which implies that our online algorithm is order-optimal. Simulation results corroborate our theoretical analysis.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2020 Information Theory and Applications Workshop (ITA)

自引率

0.00%

发文量