Junki Matsuoka, Yoshihisa Tsurumine, Yuhwan Kwon, Takamitsu Matsubara, T. Shimmura, S. Kawamura
2020 17th International Conference on Ubiquitous Robots (UR), published 2020-06-01. DOI: 10.1109/UR49135.2020.9144988 (https://doi.org/10.1109/UR49135.2020.9144988)
Learning Food-arrangement Policies from Raw Images with Generative Adversarial Imitation Learning
In this paper, we tackle the problem of food-arrangement planning with an imitation learning approach based on expert demonstrations. Specifically, we use a Generative Adversarial Imitation Learning (GAIL) framework, which allows an agent to learn near-optimal behaviors from a few expert demonstrations and its own exploration, without an explicit reward function. To evaluate our method, we developed a food-arrangement simulator for the Japanese dish "Tempura" using 3D-scanned tempura ingredients and conducted experiments to assess performance. The experimental results demonstrate that our method can learn expert-like arrangement policies from raw bird's-eye-view images of plates without a manually designed reward function and without a massive amount of expert demonstration data. Moreover, we confirmed that the learned policies are more robust to arrangement errors and environmental changes than a baseline policy trained with supervised learning.
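The adversarial idea behind the framework named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes that each state-action pair has already been reduced to a flat feature vector (the paper works from raw images, which would call for a convolutional network), that the discriminator is a plain logistic model, and that the discriminator outputs the probability that a pair came from the expert. All function names are hypothetical. The discriminator is trained to separate expert pairs from policy pairs, and its output is turned into a surrogate reward that a reinforcement learning algorithm could then maximize in place of a hand-designed reward.

```python
import numpy as np


def sigmoid(x):
    """Logistic function mapping scores to probabilities in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))


def discriminator_loss(w, expert_feats, policy_feats):
    """Binary cross-entropy with expert pairs labeled 1 and policy pairs labeled 0.

    Assumption: D(s, a) = sigmoid(features . w) is the probability that the
    pair was produced by the expert.
    """
    eps = 1e-8  # avoids log(0)
    d_expert = sigmoid(expert_feats @ w)
    d_policy = sigmoid(policy_feats @ w)
    return -(np.mean(np.log(d_expert + eps))
             + np.mean(np.log(1.0 - d_policy + eps)))


def discriminator_step(w, expert_feats, policy_feats, lr=0.1):
    """One gradient-descent step on the cross-entropy loss above."""
    d_expert = sigmoid(expert_feats @ w)
    d_policy = sigmoid(policy_feats @ w)
    grad = (-(expert_feats.T @ (1.0 - d_expert)) / len(expert_feats)
            + (policy_feats.T @ d_policy) / len(policy_feats))
    return w - lr * grad


def surrogate_reward(w, feats):
    """Reward that grows as the discriminator judges a pair more expert-like."""
    eps = 1e-8
    return -np.log(1.0 - sigmoid(feats @ w) + eps)
```

In a full training loop, discriminator updates would alternate with policy updates that maximize `surrogate_reward` on freshly collected rollouts. The policy optimizer is omitted here because the abstract does not specify it.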