Distributing rewards by strategic knowledge based on Nash-Q learning

Kazuo Igoshi, T. Miura, I. Shioya
Published in: 2008 First International Conference on the Applications of Digital Information and Web Technologies (ICADIWT), 2008-10-31
DOI: 10.1109/ICADIWT.2008.4664393 (https://doi.org/10.1109/ICADIWT.2008.4664393)
Citations: 2

Abstract

In this investigation, we examine a collaborative approach to reward distribution among multiple players in repeated general-sum stochastic games, in terms of position and rewards. Several studies of reward distribution have been reported so far, and reinforcement learning has been considered useful because no knowledge is needed in advance and better decisions can be extracted while learning. Among these methods, Q-learning has received much attention in single-agent environments. In multi-agent environments, however, we do not have a sharp target for this problem: what is the optimal principle? In this work, we discuss how to distribute rewards thoroughly by treating the problem as a general stochastic game based on game theory. That is, we introduce a Nash-Q approach, which combines Nash equilibrium with Q-learning. We show that the new approach provides a new strategic solution. We discuss experiments on a rather complicated game (the game of life) to examine the usefulness of the approach.
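As a rough illustration of what "combining Nash equilibrium with Q-learning" can look like, the sketch below shows one update step in the commonly cited two-player Nash-Q form: each player keeps a Q-table over joint actions, and the bootstrap target uses that player's payoff at a Nash equilibrium of the stage game defined by the next state's Q-values. This is not taken from the paper. The function names, the array layout `(state, action1, action2)`, the restriction to pure-strategy equilibria, and the "take the first equilibrium found" selection rule are all assumptions made for brevity.

```python
import numpy as np

def pure_nash_equilibria(q1, q2):
    """Return all pure-strategy Nash equilibria (a1, a2) of a bimatrix stage game.

    q1[a1, a2] and q2[a1, a2] are the payoffs of player 1 and player 2.
    """
    equilibria = []
    for a1 in range(q1.shape[0]):
        for a2 in range(q1.shape[1]):
            # Player 1 cannot gain by changing its row while a2 is fixed.
            best1 = q1[a1, a2] >= q1[:, a2].max()
            # Player 2 cannot gain by changing its column while a1 is fixed.
            best2 = q2[a1, a2] >= q2[a1, :].max()
            if best1 and best2:
                equilibria.append((a1, a2))
    return equilibria

def nash_q_update(Q1, Q2, s, a1, a2, r1, r2, s_next, alpha=0.1, gamma=0.9):
    """Apply one Nash-Q style update to both players' joint-action Q-tables.

    Q1 and Q2 have shape (n_states, n_actions_1, n_actions_2) and are
    modified in place. Returns the equilibrium payoffs used as targets.
    """
    equilibria = pure_nash_equilibria(Q1[s_next], Q2[s_next])
    if not equilibria:
        # Assumption of this sketch: mixed equilibria are out of scope.
        raise ValueError("no pure equilibrium; a mixed-strategy solver is needed")
    e1, e2 = equilibria[0]  # arbitrary selection rule (assumption)
    v1 = Q1[s_next][e1, e2]
    v2 = Q2[s_next][e1, e2]
    Q1[s][a1, a2] = (1 - alpha) * Q1[s][a1, a2] + alpha * (r1 + gamma * v1)
    Q2[s][a1, a2] = (1 - alpha) * Q2[s][a1, a2] + alpha * (r2 + gamma * v2)
    return v1, v2
```

The only change from single-agent Q-learning is the target: the maximum over one's own actions is replaced by an equilibrium payoff of the joint-action game at the next state. A fuller implementation would need a mixed-strategy equilibrium solver and an explicit rule for choosing among multiple equilibria.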