Learning and decision-making in artificial animals

Claes Strannegård, Nils Svangård, David Lindström, Joscha Bach, Bas R. Steunebrink
Journal of Artificial General Intelligence. Published 2018-07-01.
DOI: 10.2478/jagi-2018-0002 (https://doi.org/10.2478/jagi-2018-0002)
Citation count: 5

Abstract

A computational model for artificial animals (animats) interacting with real or artificial ecosystems is presented. All animats use the same mechanisms for learning and decision-making. Each animat has its own set of needs and its own memory structure, which undergoes continuous development and constitutes the basis for decision-making. The decision-making mechanism aims at keeping the needs of the animat as satisfied as possible for as long as possible. Reward and punishment are defined in terms of changes to the level of need satisfaction. The learning mechanisms are driven by prediction error relating to reward and punishment and are of two kinds: multi-objective local Q-learning, and structural learning that alters the architecture of the memory structures by adding and removing nodes. The animat model has the following key properties: (1) autonomy: it operates in a fully automatic fashion, without any need for interaction with human engineers. In particular, it does not depend on human engineers to provide goals, tasks, or seed knowledge. Still, it can operate either with or without human interaction; (2) generality: it uses the same learning and decision-making mechanisms in all environments, e.g., desert environments and forest environments, and for all animats, e.g., frog animats and bee animats; and (3) adequacy: it is able to learn basic forms of animal skills such as eating, drinking, locomotion, and navigation. Eight experiments are presented. The results obtained indicate that (i) dynamic memory structures are strictly more powerful than static ones; (ii) it is possible to use a fixed generic design to model basic cognitive processes of a wide range of animals and environments; and (iii) the animat framework enables a uniform and gradual approach to AGI, by successively taking on more challenging problems in the form of broader and more complex classes of environments.
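The abstract defines reward and punishment as changes in need satisfaction and names multi-objective Q-learning driven by prediction error as one of the two learning mechanisms. The following is a minimal sketch of how those two ideas could fit together, not the authors' implementation. Everything specific in it is an assumption made for illustration: the class and function names, the flat (state, action) tables standing in for the paper's memory structure, the standard tabular Q-learning update applied once per need, the parameters alpha and gamma, and the policy of acting greedily for the least satisfied need. Structural learning (adding and removing nodes) is not covered.

```python
from collections import defaultdict


def rewards_from_needs(before, after):
    """Per-need reward as the change in satisfaction level.

    Positive values act as reward, negative values as punishment.
    """
    return {need: after[need] - before[need] for need in before}


class MultiNeedQLearner:
    """Tabular Q-learning with one Q-table per need (illustrative only)."""

    def __init__(self, needs, actions, alpha=0.1, gamma=0.9):
        self.needs = list(needs)
        self.actions = list(actions)
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)
        # One table per need, mapping (state, action) to an estimated value.
        self.q = {need: defaultdict(float) for need in self.needs}

    def update(self, state, action, rewards, next_state):
        """Apply the standard Q-learning update separately for each need.

        Returns the per-need prediction errors that drove the update.
        """
        errors = {}
        for need in self.needs:
            table = self.q[need]
            best_next = max(table[(next_state, a)] for a in self.actions)
            error = rewards[need] + self.gamma * best_next - table[(state, action)]
            table[(state, action)] += self.alpha * error
            errors[need] = error
        return errors

    def choose(self, state, satisfaction):
        """Act greedily for the least satisfied need (an assumed policy)."""
        urgent = min(self.needs, key=lambda n: satisfaction[n])
        return max(self.actions, key=lambda a: self.q[urgent][(state, a)])


if __name__ == "__main__":
    learner = MultiNeedQLearner(needs=["energy", "water"], actions=["eat", "drink"])
    before = {"energy": 0.4, "water": 0.2}
    after = {"energy": 0.4, "water": 0.5}  # drinking raised water satisfaction
    rewards = rewards_from_needs(before, after)
    print(learner.update("at_pond", "drink", rewards, "at_pond"))
    print(learner.choose("at_pond", {"energy": 0.9, "water": 0.1}))
```

Keeping a separate table per need is one simple way to make the learner multi-objective: each need keeps its own value estimates, and the decision step arbitrates among them. Other arbitration rules, such as weighting values by need deficit, would fit the same skeleton.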