An intrinsic reward for affordance exploration
Stephen Hart
2009 IEEE 8th International Conference on Development and Learning
Published: 2009-06-05
DOI: 10.1109/DEVLRN.2009.5175542
Citations: 16
Abstract
In this paper, we present preliminary results demonstrating how a robot can learn environmental affordances in terms of the features that predict successful control and interaction. We extend previous work in which we proposed a learning framework that allows a robot to develop a series of hierarchical, closed-loop manipulation behaviors. Here, we examine a complementary process in which the robot builds probabilistic models of the conditions under which these behaviors are likely to succeed. To accomplish this, we present an intrinsic reward function that directs the robot's exploratory behavior towards gaining confidence in these models. We demonstrate how this single intrinsic motivator can lead to behavioral artifacts such as "novelty," "habituation," and "surprise." We present results using the bimanual robot Dexter, and explore these results further in simulation.