Interactive multiagent reinforcement learning with motivation rules

T. Yamaguchi, Ryo Marukawa
DOI: 10.1109/ICCIMA.2001.970456
Published in: Proceedings Fourth International Conference on Computational Intelligence and Multimedia Applications (ICCIMA 2001)
Publication date: 2001-10-30
Citations: 2

Abstract

We present a new framework of multi-agent reinforcement learning that acquires cooperative behaviors by generating and coordinating each learning goal interactively among agents. One of the main goals of artificial intelligence is to realize an intelligent agent that behaves autonomously according to its own sense of values. Reinforcement learning (RL) is the major learning mechanism by which an agent adapts flexibly to the various situations of an unknown environment. However, in a multi-agent system with mutual dependency among agents, it is difficult for a human to set up suitable learning goals for each agent; in addition, the existing RL framework, which aims at the egoistic optimality of each agent, is inadequate. Therefore, an active and interactive learning mechanism is required to generate and coordinate the learning goals among the agents. To realize this, we first propose treating each learning goal as a reinforcement signal (RS) that can be communicated among the agents. Second, we introduce motivation rules that integrate the RSs communicated among the agents into a reward value for an agent's RL. We then define cooperative rewards as learning goals with mutual dependency. Learning experiments with two agents under various motivation rules are performed. The experimental results show that several combinations of motivation rules converge to cooperative behaviors.
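The abstract does not give the concrete form of a motivation rule, but the idea it describes — integrating an agent's own reinforcement signal with the RSs communicated by other agents into a single reward value — can be illustrated with a minimal sketch. The weighting form, the function name `motivation_rule`, and the profile names (egoistic, cooperative, altruistic) below are illustrative assumptions, not taken from the paper.

```python
# Hedged sketch (not the paper's formulation): a motivation rule modeled as
# a linear weighting that integrates the agent's own reinforcement signal
# (RS) with the RS communicated by the partner agent into one reward value.

def motivation_rule(own_rs: float, other_rs: float,
                    w_self: float, w_other: float) -> float:
    """Integrate the agent's own RS and the partner's communicated RS
    into a single scalar reward for reinforcement learning."""
    return w_self * own_rs + w_other * other_rs

# Illustrative motivation profiles (names are assumptions, not the paper's):
EGOISTIC    = (1.0, 0.0)   # rewards only the agent's own goal
COOPERATIVE = (0.5, 0.5)   # weighs both agents' goals equally
ALTRUISTIC  = (0.0, 1.0)   # rewards only the partner's goal

own_rs, other_rs = 1.0, -1.0  # agent succeeded, partner failed
rewards = {name: motivation_rule(own_rs, other_rs, w_self, w_other)
           for name, (w_self, w_other) in [("egoistic", EGOISTIC),
                                           ("cooperative", COOPERATIVE),
                                           ("altruistic", ALTRUISTIC)]}
print(rewards)  # {'egoistic': 1.0, 'cooperative': 0.0, 'altruistic': -1.0}
```

Under such a scheme, the reward that drives each agent's RL update depends on the other agent's signal as well as its own, which is what makes cooperative behavior reachable for some weight combinations and not others.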