The two-dimensional organization of behavior

2011 IEEE International Conference on Development and Learning (ICDL). Pub Date: 2011-10-10. DOI: 10.1109/DEVLRN.2011.6037326

Mark B. Ring, T. Schaul, J. Schmidhuber


Citations: 20

Abstract

This paper addresses the problem of continual learning [1] in a new way, combining multi-modular reinforcement learning with inspiration from the motor cortex to produce a unique perspective on hierarchical behavior. Most reinforcement-learning agents represent policies monolithically, using a single table or function approximator. Where policies are split among a few modules, those modules are related to each other only in that they work together to produce the agent's overall policy. In contrast, the brain appears to organize motor behavior in a two-dimensional map in which nearby locations represent similar behaviors. This representation allows the brain to build hierarchies of motor behavior that correspond not to hierarchies of subroutines but to regions of the map, with larger regions corresponding to more general behaviors. Inspired by the benefits of the brain's representation, the system presented here is a first step, and the first attempt, toward the two-dimensional organization of learned policies according to behavioral similarity. We demonstrate a fully autonomous multi-modular system designed for the constant accumulation of ever more sophisticated skills (the continual-learning problem). The system can split up a complex task among a large number of simple modules such that nearby modules correspond to similar policies. The eventual goal is to develop and use the resulting organization hierarchically, accessing behaviors by their location and extent in the map.
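
The abstract does not say how the modules are arranged so that neighbors hold similar policies. The sketch below is only an illustration of that one property, assuming a Kohonen-style self-organizing-map update stands in for the paper's actual mechanism. `ModuleMap`, `neighborhood`, `distance`, and all parameters (`lr`, `sigma`, `dim`) are hypothetical names chosen for this example, not taken from the paper.

```python
import math
import random


def neighborhood(a, b, sigma):
    """Gaussian weight that decays with squared grid distance between cells a and b."""
    d2 = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2
    return math.exp(-d2 / (2.0 * sigma ** 2))


def distance(u, v):
    """Euclidean distance between two policy parameter vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


class ModuleMap:
    """Hypothetical 2-D grid of modules; each cell holds a policy parameter vector.

    Assumption: a SOM-style update is used here for illustration only.
    """

    def __init__(self, rows, cols, dim, seed=0):
        rng = random.Random(seed)
        self.rows, self.cols = rows, cols
        self.params = {
            (r, c): [rng.uniform(-1.0, 1.0) for _ in range(dim)]
            for r in range(rows)
            for c in range(cols)
        }

    def best_match(self, target):
        """Return the grid cell whose policy vector is closest to the target."""
        return min(self.params, key=lambda cell: distance(self.params[cell], target))

    def update(self, target, lr=0.5, sigma=1.0):
        """Pull the winning module and, more weakly, its grid neighbors toward target."""
        winner = self.best_match(target)
        for cell, vec in self.params.items():
            h = lr * neighborhood(winner, cell, sigma)
            self.params[cell] = [x + h * (t - x) for x, t in zip(vec, target)]
        return winner


grid = ModuleMap(rows=4, cols=4, dim=3)
winner = grid.update([1.0, 0.0, -1.0])
```

Because the update strength falls off with grid distance from the winning cell, repeated updates on varied targets tend to leave adjacent cells with similar vectors, which is the property the abstract attributes to the map.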

