Learning Food-arrangement Policies from Raw Images with Generative Adversarial Imitation Learning
Junki Matsuoka, Yoshihisa Tsurumine, Yuhwan Kwon, Takamitsu Matsubara, T. Shimmura, S. Kawamura
2020 17th International Conference on Ubiquitous Robots (UR), June 2020
DOI: 10.1109/UR49135.2020.9144988 (https://doi.org/10.1109/UR49135.2020.9144988)
Citations: 7
Abstract
In this paper, we tackle the problem of food-arrangement planning with an imitation learning approach that learns from expert demonstrations. Specifically, we use a Generative Adversarial Imitation Learning (GAIL) framework, which allows an agent to learn near-optimal behaviors from a few expert demonstrations and its own exploration, without an explicit reward function. To evaluate our method, we developed a food-arrangement simulator for the Japanese dish "Tempura" using 3D-scanned tempura ingredients and conducted experiments in it. The experimental results demonstrate that our method can learn expert-like arrangement policies from raw bird's-eye-view images of plates without a manually designed reward function or a large number of expert demonstrations. Moreover, we confirmed that the learned policies are more robust to arrangement errors and environmental changes than a baseline policy trained with supervised learning.
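The idea of learning without an explicit reward function can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe the network architecture or training details, so the linear discriminator, the low-dimensional feature vectors standing in for image-action pairs, and the names `LinearDiscriminator` and `surrogate_reward` are all assumptions made for illustration. The sketch uses one common GAIL convention, in which the discriminator is trained to output values near 1 for expert samples and the agent receives the surrogate reward -log(1 - D(x)).

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class LinearDiscriminator:
    """Toy logistic discriminator D(x) = sigmoid(w . x + b).

    Hypothetical stand-in for an image-based discriminator; x is a small
    feature vector representing a (state, action) pair.
    """

    def __init__(self, dim):
        self.w = [0.0] * dim
        self.b = 0.0

    def prob_expert(self, x):
        # Estimated probability that x came from the expert.
        return sigmoid(sum(wi * xi for wi, xi in zip(self.w, x)) + self.b)

    def update(self, expert_batch, agent_batch, lr=0.1):
        # One gradient step on binary cross-entropy:
        # expert samples are labeled 1, agent samples are labeled 0.
        n = len(expert_batch) + len(agent_batch)
        grad_w = [0.0] * len(self.w)
        grad_b = 0.0
        for batch, label in ((expert_batch, 1.0), (agent_batch, 0.0)):
            for x in batch:
                err = label - self.prob_expert(x)
                for i, xi in enumerate(x):
                    grad_w[i] += err * xi
                grad_b += err
        self.w = [wi + lr * g / n for wi, g in zip(self.w, grad_w)]
        self.b += lr * grad_b / n


def surrogate_reward(disc, x, eps=1e-8):
    # The reward grows as the discriminator becomes more convinced
    # that the agent's sample x looks like an expert's.
    return -math.log(1.0 - disc.prob_expert(x) + eps)
```

In a full GAIL loop, discriminator updates like this alternate with policy updates from a reinforcement learning algorithm that maximizes the surrogate reward, so the reward signal is learned from the demonstrations rather than designed by hand.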