Combining Imagination and Heuristics to Learn Strategies that Generalize

Erik J Peterson, Necati Alp Müyesser, T. Verstynen, Kyle Dunovan
{"title":"结合想象力和启发式学习泛化策略","authors":"Erik J Peterson, Necati Alp Müyesser, T. Verstynen, Kyle Dunovan","doi":"10.51628/001c.13477","DOIUrl":null,"url":null,"abstract":"Deep reinforcement learning can match or exceed human performance in stable contexts, but with minor changes to the environment artificial networks, unlike humans, often cannot adapt. Humans rely on a combination of heuristics to simplify computational load and imagination to extend experiential learning to new and more challenging environments. Motivated by theories of the hierarchical organization of the human prefrontal networks, we have developed a model of hierarchical reinforcement learning that combines both heuristics and imagination into a “stumbler-strategist” network. We test performance of this network using Wythoff’s game, a gridworld environment with a known optimal strategy. We show that a heuristic labeling of each position as hot or cold, combined with imagined play, both accelerates learning and promotes transfer to novel games, while also improving model interpretability","PeriodicalId":74289,"journal":{"name":"Neurons, behavior, data analysis and theory","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2018-09-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Combining Imagination and Heuristics to Learn Strategies that Generalize\",\"authors\":\"Erik J Peterson, Necati Alp Müyesser, T. Verstynen, Kyle Dunovan\",\"doi\":\"10.51628/001c.13477\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep reinforcement learning can match or exceed human performance in stable contexts, but with minor changes to the environment artificial networks, unlike humans, often cannot adapt. Humans rely on a combination of heuristics to simplify computational load and imagination to extend experiential learning to new and more challenging environments. Motivated by theories of the hierarchical organization of the human prefrontal networks, we have developed a model of hierarchical reinforcement learning that combines both heuristics and imagination into a “stumbler-strategist” network. We test performance of this network using Wythoff’s game, a gridworld environment with a known optimal strategy. 
We show that a heuristic labeling of each position as hot or cold, combined with imagined play, both accelerates learning and promotes transfer to novel games, while also improving model interpretability\",\"PeriodicalId\":74289,\"journal\":{\"name\":\"Neurons, behavior, data analysis and theory\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2018-09-10\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Neurons, behavior, data analysis and theory\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.51628/001c.13477\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Neurons, behavior, data analysis and theory","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.51628/001c.13477","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Deep reinforcement learning can match or exceed human performance in stable contexts, but with minor changes to the environment, artificial networks, unlike humans, often cannot adapt. Humans rely on a combination of heuristics to simplify computational load and imagination to extend experiential learning to new and more challenging environments. Motivated by theories of the hierarchical organization of the human prefrontal networks, we have developed a model of hierarchical reinforcement learning that combines both heuristics and imagination into a “stumbler-strategist” network. We test performance of this network using Wythoff’s game, a gridworld environment with a known optimal strategy. We show that a heuristic labeling of each position as hot or cold, combined with imagined play, both accelerates learning and promotes transfer to novel games, while also improving model interpretability.