How can ignorant but patient cognitive terminals learn their strategy and utility?

2010 IEEE 11th International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), Pub Date: 2010-06-20, DOI: 10.1109/SPAWC.2010.5670983

S. Perlaza, H. Tembine, S. Lasaulce


Citations: 29

Abstract

This paper aims to help bridge the gap between existing theoretical results on distributed radio resource allocation policies based on game-theoretic equilibria (which assume complete information and rational players) and the practical design of signal processing algorithms for self-configuring wireless networks. For this purpose, the framework of learning theory in games is exploited. A new learning algorithm that relies only on mild information assumptions at the transmitters is presented. This algorithm possesses attractive convergence properties not available for standard reinforcement learning algorithms. In addition, it allows each transmitter to learn both its optimal strategy and the values of its expected utility for all its actions. A detailed convergence analysis is conducted. In particular, a framework is provided for studying heterogeneous wireless networks in which transmitters do not learn at the same rate. The proposed algorithm, which can be applied to any wireless network satisfying the stated information assumptions, is applied to multiple access channels in order to provide some numerical results.
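As an illustration only, the idea of a transmitter learning both a strategy and a per-action utility estimate from its own noisy observations can be sketched as follows. This is not the algorithm of the paper, whose update rules are not given in the abstract. The two-channel toy utility, the softmax (Boltzmann-Gibbs) response, the temperature, and the step-size exponents are all assumptions made for this sketch. Giving each transmitter its own step-size exponents mimics a heterogeneous network in which transmitters learn at different rates.

```python
import math
import random


def softmax(values, temperature):
    """Boltzmann-Gibbs distribution over actions (assumed response rule)."""
    top = max(values)
    exps = [math.exp((v - top) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def toy_utility(own_channel, other_channel, rng):
    """Hypothetical payoff: high on a free channel, low on a shared one, plus noise."""
    base = 0.2 if own_channel == other_channel else 1.0
    return base + rng.gauss(0.0, 0.05)


def run(num_steps=5000, temperature=0.1, seed=0):
    rng = random.Random(seed)
    num_actions = 2
    # (utility step exponent, strategy step exponent) per transmitter;
    # different values model transmitters that learn at different rates.
    exponents = [(0.6, 0.9), (0.7, 1.0)]
    strategies = [[1.0 / num_actions] * num_actions for _ in range(2)]
    estimates = [[0.0] * num_actions for _ in range(2)]

    for t in range(1, num_steps + 1):
        # Each transmitter samples a channel from its current mixed strategy.
        actions = [
            rng.choices(range(num_actions), weights=strategies[k])[0]
            for k in range(2)
        ]
        for k in range(2):
            # Only the transmitter's own noisy utility is observed.
            observed = toy_utility(actions[k], actions[1 - k], rng)
            alpha = 1.0 / t ** exponents[k][0]
            lam = 1.0 / t ** exponents[k][1]
            a = actions[k]
            # Utility learning: move the estimate of the played action
            # toward the observation.
            estimates[k][a] += alpha * (observed - estimates[k][a])
            # Strategy learning: move the strategy toward the softmax
            # of the current utility estimates.
            target = softmax(estimates[k], temperature)
            strategies[k] = [
                p + lam * (q - p) for p, q in zip(strategies[k], target)
            ]
    return strategies, estimates


if __name__ == "__main__":
    final_strategies, final_estimates = run()
    print("strategies:", final_strategies)
    print("utility estimates:", final_estimates)
```

Because each strategy update is a convex combination of two probability vectors, the strategies remain valid distributions throughout. Whether and where such a scheme converges depends on the game and on the step sizes, which is the kind of question the convergence analysis mentioned in the abstract addresses.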
