Embodied Reasoning for Discovering Object Properties via Manipulation

J. Behrens, Michal Nazarczuk, K. Štěpánová, M. Hoffmann, Y. Demiris, K. Mikolajczyk
{"title":"Embodied Reasoning for Discovering Object Properties via Manipulation","authors":"J. Behrens, Michal Nazarczuk, K. Štěpánová, M. Hoffmann, Y. Demiris, K. Mikolajczyk","doi":"10.1109/ICRA48506.2021.9561212","DOIUrl":null,"url":null,"abstract":"In this paper, we present an integrated system that includes reasoning from visual and natural language inputs, action and motion planning, executing tasks by a robotic arm, manipulating objects, and discovering their properties. A vision to action module recognises the scene with objects and their attributes and analyses enquiries formulated in natural language. It performs multi-modal reasoning and generates a sequence of simple actions that can be executed by a robot. The scene model and action sequence are sent to a planning and execution module that generates a motion plan with collision avoidance, simulates the actions, and executes them. We use synthetic data to train various components of the system and test on a real robot to show the generalization capabilities. We focus on a tabletop scenario with objects that can be grasped by our embodied agent i.e. a 7DoF manipulator with a two-finger gripper. We evaluate the agent on 60 representative queries repeated 3 times (e.g., ’Check what is on the other side of the soda can’) concerning different objects and tasks in the scene. We perform experiments in a simulated and real environment and report the success rate for various components of the system. Our system achieves up to 80.6% success rate on challenging scenes and queries. We also analyse and discuss the challenges that such an intelligent embodied system faces.","PeriodicalId":108312,"journal":{"name":"2021 IEEE International Conference on Robotics and Automation (ICRA)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-05-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Conference on Robotics and Automation (ICRA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICRA48506.2021.9561212","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

In this paper, we present an integrated system that combines reasoning over visual and natural-language inputs, action and motion planning, and execution of tasks by a robotic arm that manipulates objects and discovers their properties. A vision-to-action module recognises the objects in the scene together with their attributes and analyses enquiries formulated in natural language. It performs multi-modal reasoning and generates a sequence of simple actions that can be executed by a robot. The scene model and action sequence are sent to a planning and execution module that generates a motion plan with collision avoidance, simulates the actions, and executes them. We train the various components of the system on synthetic data and test on a real robot to demonstrate the generalization capabilities. We focus on a tabletop scenario with objects that can be grasped by our embodied agent, i.e., a 7-DoF manipulator with a two-finger gripper. We evaluate the agent on 60 representative queries, each repeated 3 times (e.g., 'Check what is on the other side of the soda can'), concerning different objects and tasks in the scene. We perform experiments in both simulated and real environments and report the success rate for the various components of the system. Our system achieves up to an 80.6% success rate on challenging scenes and queries. We also analyse and discuss the challenges that such an intelligent embodied system faces.
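
To make the described pipeline concrete, below is a minimal, self-contained sketch of the two stages the abstract outlines: a vision-to-action stage that maps a scene model and a natural-language query to a sequence of primitive actions, and a plan-simulate-execute loop. The paper does not publish code or an API, so every name here (`SceneObject`, `Action`, `vision_to_action`, `motion_plan`, `simulate`, `plan_and_execute`) and the toy query-handling rule are hypothetical illustrations of the architecture, not the authors' implementation.

```python
# Hypothetical sketch of the vision-to-action + plan/execute pipeline.
# All classes, functions, and rules below are illustrative assumptions;
# the paper does not specify an interface.

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SceneObject:
    name: str
    position: tuple                      # (x, y, z) in the robot base frame
    attributes: dict = field(default_factory=dict)


@dataclass
class Action:
    kind: str                            # e.g. "grasp", "rotate", "inspect"
    target: str                          # name of the object acted upon


def vision_to_action(scene: list[SceneObject], query: str) -> list[Action]:
    """Translate a natural-language query into primitive actions
    (a toy rule standing in for the paper's multi-modal reasoner)."""
    target = next(o.name for o in scene if o.name in query)
    if "other side" in query:
        # Inspecting a hidden face requires picking the object up,
        # turning it, looking, and putting it back.
        return [Action(k, target) for k in ("grasp", "rotate", "inspect", "place")]
    return [Action("inspect", target)]


def motion_plan(action: Action) -> Optional[list[str]]:
    """Stand-in for collision-aware motion planning: returns waypoints,
    or None if no collision-free plan exists."""
    return [f"{action.kind}:{action.target}:waypoint{i}" for i in range(3)]


def simulate(motion: list[str]) -> bool:
    """Stand-in for validating a motion plan in simulation before execution."""
    return len(motion) > 0


def plan_and_execute(actions: list[Action]) -> bool:
    """Plan, simulate, then execute each action in order; abort on failure."""
    for action in actions:
        motion = motion_plan(action)
        if motion is None or not simulate(motion):
            return False
        print("executing", motion)       # real system: send to the 7-DoF arm
    return True


scene = [SceneObject("soda can", (0.4, 0.1, 0.05), {"material": "metal"})]
actions = vision_to_action(scene, "Check what is on the other side of the soda can")
print("success:", plan_and_execute(actions))
```

The plan-then-simulate-then-execute ordering mirrors the abstract's description, in which generated motions are validated in simulation before being executed on the real arm.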