Intrinsic Motivation and Introspection in Reinforcement Learning

K. Merrick
{"title":"Intrinsic Motivation and Introspection in Reinforcement Learning","authors":"K. Merrick","doi":"10.1109/TAMD.2012.2208457","DOIUrl":null,"url":null,"abstract":"Incorporating intrinsic motivation with reinforcement learning can permit agents to independently choose, which skills they will develop, or to change their focus of attention to learn different skills at different times. This implies an autonomous developmental process for skills in which a skill-acquisition goal is first identified, then a skill is learned to solve the goal. The learned skill may then be stored, reused, temporarily ignored or even permanently erased. This paper formalizes the developmental process for skills by proposing a goal-lifecycle using the option framework for motivated reinforcement learning agents. The paper shows how the goal-lifecycle can be used as a basis for designing motivational state-spaces that permit agents to reason introspectively and autonomously about when to learn skills to solve goals, when to activate skills, when to suspend activation of skills or when to delete skills. An algorithm is presented that simultaneously learns: 1) an introspective policy mapping motivational states to decisions that change the agent's motivational state, and 2) multiple option policies mapping sensed states and actions to achieve various domain-specific goals. Two variations of agents using this model are compared to motivated reinforcement learning agents without introspection for controlling non-player characters in a computer game scenario. Results show that agents using introspection can focus their attention on learning more complex skills than agents without introspection. In addition, they can learn these skills more effectively.","PeriodicalId":49193,"journal":{"name":"IEEE Transactions on Autonomous Mental Development","volume":"4 1","pages":"315-329"},"PeriodicalIF":0.0000,"publicationDate":"2012-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1109/TAMD.2012.2208457","citationCount":"21","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Autonomous Mental Development","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/TAMD.2012.2208457","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 21

Abstract

Incorporating intrinsic motivation with reinforcement learning can permit agents to independently choose which skills they will develop, or to change their focus of attention to learn different skills at different times. This implies an autonomous developmental process for skills in which a skill-acquisition goal is first identified, then a skill is learned to achieve that goal. The learned skill may then be stored, reused, temporarily ignored, or even permanently erased. This paper formalizes the developmental process for skills by proposing a goal-lifecycle using the option framework for motivated reinforcement learning agents. The paper shows how the goal-lifecycle can be used as a basis for designing motivational state-spaces that permit agents to reason introspectively and autonomously about when to learn skills to solve goals, when to activate skills, when to suspend activation of skills, or when to delete skills. An algorithm is presented that simultaneously learns: 1) an introspective policy mapping motivational states to decisions that change the agent's motivational state, and 2) multiple option policies mapping sensed states and actions to achieve various domain-specific goals. Two variations of agents using this model are compared to motivated reinforcement learning agents without introspection for controlling non-player characters in a computer game scenario. Results show that agents using introspection can focus their attention on learning more complex skills than agents without introspection. In addition, they can learn these skills more effectively.
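The abstract describes a two-level learner: an introspective policy chooses goal-lifecycle decisions over motivational states, while option policies for individual goals are learned from domain experience. The sketch below is a minimal illustration of that structure, not the paper's implementation: it assumes tabular Q-learning at both levels, and the lifecycle decisions (learn, activate, suspend, delete) and all class and method names are hypothetical, taken only from the abstract's wording.

```python
"""Minimal sketch of a two-level motivated RL agent, assuming tabular
Q-learning at both levels. Not the paper's algorithm; names are
illustrative placeholders based on the abstract."""

import random
from collections import defaultdict

# Hypothetical goal-lifecycle decisions suggested by the abstract.
DECISIONS = ("learn", "activate", "suspend", "delete")


class IntrospectiveAgent:
    def __init__(self, goals, alpha=0.1, gamma=0.9, epsilon=0.1):
        self.goals = goals                      # domain-specific goals
        self.alpha, self.gamma, self.eps = alpha, gamma, epsilon
        # Introspective policy: motivational state -> value per lifecycle decision.
        self.meta_q = defaultdict(lambda: {d: 0.0 for d in DECISIONS})
        # One option policy (Q-table over sensed states/actions) per goal.
        self.option_q = {g: defaultdict(lambda: defaultdict(float)) for g in goals}

    def choose_decision(self, m_state):
        """Epsilon-greedy choice of a lifecycle decision for the
        current motivational state."""
        if random.random() < self.eps:
            return random.choice(DECISIONS)
        q = self.meta_q[m_state]
        return max(q, key=q.get)

    def update_meta(self, m_state, decision, reward, next_m_state):
        """One-step Q-learning update for the introspective policy,
        driven by intrinsic (motivational) reward."""
        q = self.meta_q[m_state]
        best_next = max(self.meta_q[next_m_state].values())
        q[decision] += self.alpha * (reward + self.gamma * best_next - q[decision])

    def update_option(self, goal, s, a, reward, s_next, actions):
        """One-step Q-learning update for the option policy of `goal`,
        driven by domain-specific reward."""
        q = self.option_q[goal]
        best_next = max(q[s_next][a2] for a2 in actions)
        q[s][a] += self.alpha * (reward + self.gamma * best_next - q[s][a])
```

In this reading, the two updates run simultaneously, as the abstract states: when the introspective policy selects "learn" or "activate" for a goal, the corresponding option policy is updated from domain experience, while the introspective policy itself is updated from the intrinsic reward produced by the resulting change in motivational state.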