Learning Food-arrangement Policies from Raw Images with Generative Adversarial Imitation Learning

2020 17th International Conference on Ubiquitous Robots (UR) | Pub Date: 2020-06-01 | DOI: 10.1109/UR49135.2020.9144988

Junki Matsuoka, Yoshihisa Tsurumine, Yuhwan Kwon, Takamitsu Matsubara, T. Shimmura, S. Kawamura


Citations: 7

Abstract



Learning Food-arrangement Policies from Raw Images with Generative Adversarial Imitation Learning

In this paper, we tackle the problem of food-arrangement planning with an imitation learning approach based on expert demonstrations. Specifically, we use a Generative Adversarial Imitation Learning (GAIL) framework, which allows an agent to learn near-optimal behaviors from a few expert demonstrations and self-exploration without an explicit reward function. To evaluate our method, we developed a food-arrangement simulator for the Japanese dish "Tempura" using 3D-scanned tempura ingredients and conducted experiments to evaluate its performance. The experimental results demonstrate that our method can learn expert-like arrangement policies from bird's-eye-view raw images of plates without manually designing a reward function or requiring a large amount of expert demonstration data. Moreover, we confirmed that the learned policies are more robust against arrangement errors and environmental changes than a baseline policy trained with supervised learning.
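The core idea of Generative Adversarial Imitation Learning, learning without an explicit reward function, can be illustrated with a toy sketch. A discriminator is trained to tell expert samples from agent samples, and its output is turned into a surrogate reward for the agent. Everything below is an assumption made for illustration and is not taken from the paper: the 2-D feature vectors stand in for image-derived state-action features, the logistic-regression discriminator stands in for a neural network, the reward form -log(1 - D) is only one common convention, and the policy-update step is omitted.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logit(x, w, b):
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_discriminator(expert, agent, steps=200, lr=0.5):
    """Fit D(x) = P(x came from the expert) by gradient descent
    on the binary cross-entropy loss (hypothetical toy discriminator)."""
    dim = len(expert[0])
    w, b = [0.0] * dim, 0.0
    data = [(x, 1.0) for x in expert] + [(x, 0.0) for x in agent]
    n = len(data)
    for _ in range(steps):
        gw, gb = [0.0] * dim, 0.0
        for x, y in data:
            err = sigmoid(logit(x, w, b)) - y  # d(loss)/d(logit)
            for i in range(dim):
                gw[i] += err * x[i]
            gb += err
        w = [wi - lr * g / n for wi, g in zip(w, gw)]
        b -= lr * gb / n
    return w, b


def surrogate_reward(x, w, b, eps=1e-8):
    # Larger when the discriminator believes x looks expert-like.
    return -math.log(1.0 - sigmoid(logit(x, w, b)) + eps)


# Synthetic stand-ins for expert and agent features (assumed data).
rng = random.Random(0)
expert = [[rng.gauss(1.0, 0.5), rng.gauss(1.0, 0.5)] for _ in range(50)]
agent = [[rng.gauss(-1.0, 0.5), rng.gauss(-1.0, 0.5)] for _ in range(50)]

w, b = train_discriminator(expert, agent)
mean_expert = sum(surrogate_reward(x, w, b) for x in expert) / len(expert)
mean_agent = sum(surrogate_reward(x, w, b) for x in agent) / len(agent)
print("mean surrogate reward, expert-like samples:", mean_expert)
print("mean surrogate reward, agent samples:", mean_agent)
```

In a full training loop, the agent would be updated with a reinforcement learning algorithm to maximize this surrogate reward, and the discriminator would be retrained on fresh agent samples, alternating until the two distributions become hard to distinguish.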
