Behavioral Cloning from Observation with Bi-directional Dynamics Model

2021 IEEE/SICE International Symposium on System Integration (SII). Pub Date: 2021-01-11. DOI: 10.1109/IEEECONF49454.2021.9382751

Tobias Betz, Hidehito Fujiishi, Taisuke Kobayashi


Citations: 1

Abstract

Robotics is rapidly expanding its workplace from industrial factories into more complicated fields, where robots work on behalf of humans. However, the difficulty of programming in advance the operations that humans perform hinders this expansion. Behavioral cloning is a promising approach for acquiring such operations effectively from an expert's demonstrations, which consist of the expert's states and the actions the expert performed. However, measuring the expert's actions is intractable and/or highly expensive for robots. Behavioral cloning from observation fills this gap: it enables imitation from state-only demonstrations by using an inverse dynamics model to infer the actions the expert performed. Our goal is to improve the accuracy of this algorithm by evaluating each inferred action with an additional forward dynamics model. Specifically, we focus on the consistency between the two dynamics models, which must be bi-directional. This bi-directionality makes it possible to classify whether an inferred action is realistic and thereby to prevent wrong updates. We demonstrate the improvement achieved by our method on various simulation tasks that are commonly used as benchmarks.
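
The consistency idea can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how consistency is measured, so the error threshold, the toy dynamics (next state = state + action), and the names `forward_model`, `inverse_model`, and `consistency_mask` are all assumptions made for illustration. The sketch infers an action from each observed transition, feeds it back through the forward model, and keeps only the transitions whose predicted next state matches the observed one.

```python
import numpy as np

def forward_model(state, action):
    """Hypothetical forward dynamics: predicts the next state (toy stand-in for a trained network)."""
    return state + action

def inverse_model(state, next_state):
    """Hypothetical inverse dynamics: infers the action behind a transition."""
    inferred = next_state - state
    # Assumption for the demo: the model fails (outputs 0) on large state changes.
    return np.where(np.abs(inferred) > 1.0, 0.0, inferred)

def consistency_mask(states, next_states, threshold=0.1):
    """Return inferred actions and a mask of transitions that pass the check.

    A transition passes when the forward model, given the inferred action,
    reproduces the observed next state within `threshold` (an assumed criterion).
    """
    actions = inverse_model(states, next_states)
    predicted = forward_model(states, actions)
    error = np.linalg.norm(predicted - next_states, axis=-1)
    return actions, error <= threshold

# State-only demonstration: three observed transitions in a 1-D state space.
states = np.array([[0.0], [0.5], [1.0]])
next_states = np.array([[0.5], [1.0], [3.0]])

actions, mask = consistency_mask(states, next_states)
# Only the consistent (state, inferred action) pairs would feed the cloning update.
train_states, train_actions = states[mask], actions[mask]
```

In this toy setup the third transition is rejected because its inferred action does not reproduce the observed next state, so it would not contribute to the policy update.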

