An intrinsic reward for affordance exploration

Stephen Hart
Published in: 2009 IEEE 8th International Conference on Development and Learning
Publication date: 2009-06-05
DOI: 10.1109/DEVLRN.2009.5175542 (https://doi.org/10.1109/DEVLRN.2009.5175542)
Citations: 16

Abstract

In this paper, we present preliminary results demonstrating how a robot can learn environmental affordances in terms of the features that predict successful control and interaction. We extend previous work in which we proposed a learning framework that allows a robot to develop a series of hierarchical, closed-loop manipulation behaviors. Here, we examine a complementary process where the robot builds probabilistic models about the conditions under which these behaviors are likely to succeed. To accomplish this, we present an intrinsic reward function that directs the robot's exploratory behavior towards gaining confidence in these models. We demonstrate how this single intrinsic motivator can lead to artifacts of behavior such as “novelty,” “habituation,” and “surprise.” We present results using the bimanual robot Dexter, and explore these results further in simulation.
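
The abstract does not give the reward function itself, so the following is only an illustrative sketch of the general idea, not the paper's method. It assumes a hypothetical Beta-Bernoulli model of whether a behavior succeeds in one context, and a hypothetical reward equal to the absolute change in the model's posterior variance after each trial. The names `BetaModel` and `intrinsic_reward` are invented for this example.

```python
from dataclasses import dataclass


@dataclass
class BetaModel:
    """Beta posterior over the probability that a behavior succeeds (assumed model)."""
    alpha: float = 1.0  # pseudo-count of successes; 1.0 gives a uniform prior
    beta: float = 1.0   # pseudo-count of failures; 1.0 gives a uniform prior

    def variance(self) -> float:
        """Variance of the Beta(alpha, beta) distribution."""
        n = self.alpha + self.beta
        return (self.alpha * self.beta) / (n * n * (n + 1.0))

    def update(self, success: bool) -> None:
        """Fold one observed trial outcome into the posterior."""
        if success:
            self.alpha += 1.0
        else:
            self.beta += 1.0


def intrinsic_reward(model: BetaModel, success: bool) -> float:
    """Hypothetical reward: absolute change in posterior variance from one trial.

    Note that this updates `model` in place.
    """
    before = model.variance()
    model.update(success)
    return abs(before - model.variance())


if __name__ == "__main__":
    model = BetaModel()
    # Repeated successes in the same context yield a shrinking reward.
    for trial in range(10):
        print(trial, round(intrinsic_reward(model, True), 5))
    # An unexpected failure after ten successes moves the model more
    # than an eleventh success would have.
    print("failure:", round(intrinsic_reward(model, False), 5))
```

Under these assumptions, a fresh model pays the largest reward on its first trial (a rough analogue of "novelty"), the reward decays as the same outcome repeats ("habituation"), and an outcome that contradicts a confident model produces a larger reward than the expected outcome would ("surprise").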