Bidirectional Generation of Object Images and Positions using Deep Generative Models for Service Robotics Applications

2021 IEEE/SICE International Symposium on System Integration (SII). Pub Date: 2021-01-11. DOI: 10.1109/IEEECONF49454.2021.9382768

Kaede Hayashi, Wenru Zheng, Lotfi El Hafi, Y. Hagiwara, T. Taniguchi

Citations: 3

Abstract

The introduction of systems and robots for automated services is important for reducing running costs and improving operational efficiency in the retail industry. To this aim, we develop a system that enables robot agents to display products in stores. The main problem in automating product display using common supervised methods with robot agents is the huge amount of data required to recognize product categories and arrangements in a variety of different store layouts. To solve this problem, we propose a crossmodal inference system based on joint multimodal variational autoencoder (JMVAE) that learns the relationship between object image information and location information observed on site by robot agents. In our experiments, we created a simulation environment replicating a convenience store that allows a robot agent to observe an object image and its 3D coordinate information, and confirmed whether JMVAE can learn and generate a shared representation of an object image and 3D coordinates in a bidirectional manner.
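
The abstract does not state the training objective, so the following is only a minimal sketch of how a JMVAE-style loss for one (image, position) pair could be assembled. It assumes the JMVAE-kl variant from the general JMVAE literature, diagonal Gaussian encoders, and a standard normal prior; the function names `gaussian_kl` and `jmvae_kl_loss`, the `alpha` weight, and the idea that reconstruction terms arrive as precomputed negative log-likelihoods are all illustrative assumptions, not details taken from the paper.

```python
import math
from typing import Sequence, Tuple

# A diagonal Gaussian is represented as (means, log-variances), one entry per latent dimension.
Gaussian = Tuple[Sequence[float], Sequence[float]]


def gaussian_kl(q: Gaussian, p: Gaussian) -> float:
    """Closed-form KL(q || p) between two diagonal Gaussians."""
    mu_q, logvar_q = q
    mu_p, logvar_p = p
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, logvar_q, mu_p, logvar_p):
        total += lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0
    return 0.5 * total


def jmvae_kl_loss(
    image_nll: float,       # -log p(image | z), assumed computed by an image decoder
    position_nll: float,    # -log p(position | z), assumed computed by a position decoder
    q_joint: Gaussian,      # q(z | image, position)
    q_image: Gaussian,      # q(z | image), used for image -> position inference
    q_position: Gaussian,   # q(z | position), used for position -> image inference
    alpha: float = 1.0,     # hypothetical weight on the alignment terms
) -> float:
    """Loss to minimize: negative joint ELBO plus KL terms that pull the
    single-modality encoders toward the joint encoder."""
    dim = len(q_joint[0])
    prior: Gaussian = ([0.0] * dim, [0.0] * dim)  # standard normal prior p(z)
    negative_elbo = image_nll + position_nll + gaussian_kl(q_joint, prior)
    alignment = gaussian_kl(q_joint, q_image) + gaussian_kl(q_joint, q_position)
    return negative_elbo + alpha * alignment


# Example call with made-up numbers for a 2-dimensional latent space.
loss = jmvae_kl_loss(
    image_nll=120.0,
    position_nll=3.5,
    q_joint=([0.2, -0.1], [0.0, 0.0]),
    q_image=([0.3, -0.2], [0.1, 0.1]),
    q_position=([0.0, 0.0], [0.5, 0.5]),
    alpha=0.1,
)
```

In a sketch like this, the alignment terms are what would make bidirectional generation possible: once `q_image` and `q_position` agree with `q_joint`, a latent sampled from either single-modality encoder can be decoded into the other modality.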

Source journal

2021 IEEE/SICE International Symposium on System Integration (SII)

Self-citation rate

0.00%

Number of publications