Modeling the Development of Infant Imitation using Inverse Reinforcement Learning

2018 Joint IEEE 8th International Conference on Development and Learning and Epigenetic Robotics (ICDL-EpiRob). Pub Date: 2018-09-01. DOI: 10.1109/DEVLRN.2018.8761045

Ahmet E. Tekden, Emre Ugur, Y. Nagai, Erhan Öztop


Citations: 0

Abstract

Little is known about the computational mechanisms by which imitation skills develop alongside infant sensorimotor learning. In robotics, there are several well-developed frameworks for imitation learning, also called learning by demonstration. Two paradigms dominate: Direct Learning (DL) and Inverse Reinforcement Learning (IRL). The former is a simple mechanism in which observed state-action pairs are associated to construct a copy of the demonstrator's action policy. The latter seeks an optimality principle or reward structure under which the observed behavior is the optimal solution. In this study, we explore whether some form of IRL mechanism in infants could plausibly facilitate imitation learning and the understanding of others' behavior. We propose that infants project events taking place in the environment onto their internal representations through a set of features that evolves during development. We implement this idea in a grid-world environment, which can be regarded as a simple model of reaching with obstacle avoidance. The observing infant has to imitate the demonstrator's reaching behavior through IRL, using various sets of features that correspond to different stages of development. Our simulation results indicate that the U-shaped performance change observed in infants during imitation development can be reproduced with the proposed model.
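The contrast between DL and IRL, and the role of the feature set, can be illustrated with a toy sketch. This is not the authors' implementation: the 3x5 grid, the single obstacle, the two feature names (`goal`, `obstacle`), the demonstration path, and the brute-force search over three candidate weights per feature (`crude_irl`) are all assumptions chosen to keep the example small. The search stands in for a proper IRL algorithm and only matches feature counts between the demonstration and the planned path.

```python
from itertools import product

# Hypothetical toy world: 3 rows x 5 columns, one obstacle between start and goal.
ROWS, COLS = 3, 5
START, GOAL, OBSTACLES = (1, 0), (1, 4), {(1, 2)}
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
STATES = list(product(range(ROWS), range(COLS)))

def step(state, action):
    """Deterministic move; the goal is absorbing and walls block movement."""
    if state == GOAL:
        return state
    r, c = state[0] + ACTIONS[action][0], state[1] + ACTIONS[action][1]
    return (r, c) if 0 <= r < ROWS and 0 <= c < COLS else state

def phi(state, feature_set):
    """Project a state onto the features the observer currently perceives."""
    known = {"goal": float(state == GOAL), "obstacle": float(state in OBSTACLES)}
    return [known[name] for name in feature_set]

def feature_counts(path, feature_set):
    """Sum feature vectors over the states entered along a path (start excluded)."""
    vectors = [phi(s, feature_set) for s in path[1:]]
    return [sum(column) for column in zip(*vectors)]

def plan(weights, feature_set, gamma=0.9, sweeps=50):
    """Value iteration for the linear reward w . phi(s'), paid on entering s'."""
    def q(s, a, V):
        nxt = step(s, a)
        reward = sum(w * f for w, f in zip(weights, phi(nxt, feature_set)))
        return reward + gamma * V[nxt]
    V = {s: 0.0 for s in STATES}
    for _ in range(sweeps):
        V = {s: 0.0 if s == GOAL else max(q(s, a, V) for a in ACTIONS)
             for s in STATES}
    return {s: max(ACTIONS, key=lambda a: q(s, a, V)) for s in STATES}

def rollout(policy, max_steps=20):
    """Follow a policy from START until the goal or the step limit is reached."""
    path = [START]
    while path[-1] != GOAL and len(path) <= max_steps:
        path.append(step(path[-1], policy[path[-1]]))
    return path

def crude_irl(demo, feature_set):
    """Stand-in for IRL: pick the candidate weights whose planned path has
    feature counts closest to those of the demonstration."""
    target = feature_counts(demo, feature_set)
    def mismatch(weights):
        got = feature_counts(rollout(plan(weights, feature_set)), feature_set)
        return sum((t - g) ** 2 for t, g in zip(target, got))
    candidates = product([-1.0, 0.0, 1.0], repeat=len(feature_set))
    return min(candidates, key=mismatch)

# Assumed demonstration: the demonstrator detours around the obstacle.
DEMO = [(1, 0), (0, 0), (0, 1), (0, 2), (0, 3), (0, 4), (1, 4)]

# Direct Learning: memorise the demonstrated state -> action pairs.
direct_policy = {s: a for s, nxt in zip(DEMO, DEMO[1:])
                 for a in ACTIONS if step(s, a) == nxt}

# IRL with two hypothetical developmental stages of the feature set.
early_weights = crude_irl(DEMO, ["goal"])
late_weights = crude_irl(DEMO, ["goal", "obstacle"])
early_path = rollout(plan(early_weights, ["goal"]))
late_path = rollout(plan(late_weights, ["goal", "obstacle"]))

print("DL covers", len(direct_policy), "of", len(STATES), "states")
print("goal-only features:", early_weights, early_path)
print("goal+obstacle features:", late_weights, late_path)
```

In this sketch the memorised DL policy has no action for states the demonstrator never visited, whereas the reward recovered by the search yields a policy for every state. With only the `goal` feature, the recovered reward cannot express obstacle avoidance, so the planned path cuts straight through the obstacle; adding the `obstacle` feature lets the planned path detour around it.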

