Learning Food-arrangement Policies from Raw Images with Generative Adversarial Imitation Learning

2020 17th International Conference on Ubiquitous Robots (UR). Pub Date: 2020-06-01. DOI: 10.1109/UR49135.2020.9144988

Junki Matsuoka, Yoshihisa Tsurumine, Yuhwan Kwon, Takamitsu Matsubara, T. Shimmura, S. Kawamura

Citations: 7

Abstract

In this paper, we tackle the problem of food-arrangement planning with an imitation learning approach that learns from expert demonstrations. Specifically, we use a Generative Adversarial Imitation Learning framework, which allows an agent to learn near-optimal behaviors from a few expert demonstrations and its own exploration, without an explicit reward function. To evaluate our method, we developed a food-arrangement simulator for the Japanese dish "Tempura" using 3D-scanned tempura ingredients and conducted experiments in it. The experimental results demonstrate that our method can learn expert-like arrangement policies from raw bird's-eye-view images of plates, without a manually designed reward function and without a massive amount of expert demonstration data. Moreover, we confirmed that the learned policies are more robust against arrangement errors and environmental changes than a baseline policy trained with supervised learning.
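As a rough illustration of how an adversarial imitation setup can stand in for a hand-designed reward, the sketch below shows one common formulation: a discriminator is trained to separate expert state-action pairs from the agent's own, and the agent is rewarded for pairs the discriminator takes to be expert-like. This is not taken from the paper. The function names, the convention that D(s, a) is the probability of "expert", and the reward form r = -log(1 - D(s, a)) are all assumptions made for illustration, and the discriminator network itself (which would consume the plate images) is left out and represented only by its output logits.

```python
import math


def softplus(z):
    """Numerically stable log(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def discriminator_loss(expert_logits, policy_logits):
    """Binary cross-entropy over discriminator logits.

    Assumed convention: expert pairs are labelled 1, policy pairs 0,
    and D(s, a) = sigmoid(logit).
    """
    # -log(sigmoid(z)) == softplus(-z)
    expert_term = sum(softplus(-z) for z in expert_logits) / len(expert_logits)
    # -log(1 - sigmoid(z)) == softplus(z)
    policy_term = sum(softplus(z) for z in policy_logits) / len(policy_logits)
    return expert_term + policy_term


def surrogate_reward(logit):
    """Assumed reward r = -log(1 - D(s, a)), which equals softplus(logit)."""
    return softplus(logit)


if __name__ == "__main__":
    # Hypothetical logits; a real discriminator would compute these from images.
    print(discriminator_loss([2.0, 1.5], [-1.0, -2.5]))
    print(surrogate_reward(2.0), surrogate_reward(-2.0))
```

Under these assumptions, the reward increases as the discriminator becomes more convinced that a state-action pair came from the expert, so a standard reinforcement learning algorithm can optimize the policy against it while the discriminator is updated in alternation.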
