A reinforcement learning approach to cooperative problem solving

Tetsuya Yoshida, K. Hori, S. Nakasuka
{"title":"A reinforcement learning approach to cooperative problem solving","authors":"Tetsuya Yoshida, K. Hori, S. Nakasuka","doi":"10.1109/ICMAS.1998.699295","DOIUrl":null,"url":null,"abstract":"We propose an extension of reinforcement learning methods to cooperative problem solving in multi agent systems. Exploiting multiple agents for complex problems is promising, however, learning is necessary since complete domain knowledge is rarely available. The temporal difference algorithm is applied in each agent to learn a heuristic evaluation of states. Besides the reward for solutions produced by agents, we define the reward for coherence as a whole and exploit them to facilitate cooperation among agents for global problem solving. We evaluate the method by experiments on the satellite design problem. The result shows that our method enables agents to learn to cooperate as well as to learn individual heuristics within one framework. Especially, agents themselves learn to take the appropriate balance between exploration and exploitation in problem solving, which is known to greatly affect the performance. It also suggests the possibility of controlling the global behavior of multi agent systems via rewards in reinforcement learning.","PeriodicalId":244857,"journal":{"name":"Proceedings International Conference on Multi Agent Systems (Cat. No.98EX160)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1998-07-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings International Conference on Multi Agent Systems (Cat. No.98EX160)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMAS.1998.699295","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

We propose an extension of reinforcement learning methods to cooperative problem solving in multi-agent systems. Exploiting multiple agents for complex problems is promising; however, learning is necessary since complete domain knowledge is rarely available. A temporal difference algorithm is applied in each agent to learn a heuristic evaluation of states. Besides the reward for solutions produced by individual agents, we define a reward for the coherence of the system as a whole and exploit both to facilitate cooperation among agents toward global problem solving. We evaluate the method with experiments on a satellite design problem. The results show that our method enables agents to learn to cooperate as well as to learn individual heuristics within one framework. In particular, the agents themselves learn to strike an appropriate balance between exploration and exploitation in problem solving, which is known to greatly affect performance. The results also suggest the possibility of controlling the global behavior of multi-agent systems via rewards in reinforcement learning.
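
The abstract does not spell out the update rule, but one plausible reading is a standard TD(0) state-value update in each agent, with the scalar reward combining an individual solution reward and a shared coherence reward. The sketch below is illustrative only: the mixing weight `beta`, the epsilon-greedy successor choice, and the state encoding are assumptions for illustration, not details taken from the paper.

```python
# Illustrative sketch, not the authors' implementation: a per-agent TD(0)
# state-value learner whose reward mixes an individual solution reward with
# a shared coherence reward. `beta`, epsilon-greedy selection, and the state
# representation are assumptions introduced for this example.
import random
from collections import defaultdict


class CooperativeTDAgent:
    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.2, beta=0.5):
        self.values = defaultdict(float)  # learned heuristic evaluation of states
        self.alpha = alpha                # learning rate
        self.gamma = gamma                # discount factor
        self.epsilon = epsilon            # exploration rate
        self.beta = beta                  # weight on the shared coherence reward

    def choose(self, candidate_states):
        """Epsilon-greedy: explore a random successor or exploit the best-valued one."""
        if random.random() < self.epsilon:
            return random.choice(candidate_states)
        return max(candidate_states, key=lambda s: self.values[s])

    def update(self, state, next_state, solution_reward, coherence_reward):
        """TD(0) update driven by the combined individual and group-level reward."""
        reward = solution_reward + self.beta * coherence_reward
        target = reward + self.gamma * self.values[next_state]
        self.values[state] += self.alpha * (target - self.values[state])
```

Under these assumptions, broadcasting the same coherence reward to every agent lets each learned value reflect both local solution quality and global consistency, which is one way the reported balance between exploration and exploitation could emerge from the values themselves.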