Semantic Scene Manipulation Based on 3D Spatial Object Relations and Language Instructions
Rainer Kartmann, Danqing Liu, T. Asfour
2020 IEEE-RAS 20th International Conference on Humanoid Robots (Humanoids)
DOI: 10.1109/HUMANOIDS47582.2021.9555802
Published: 2021-07-19
Robot understanding of spatial object relations is key to symbiotic human-robot interaction. To generate manipulation action goals that change a scene by relocating objects relative to each other, a robot must understand both the relations between objects in the current scene and the target relations specified in natural language commands. This ability requires a representation of spatial relations that maps spatial relation symbols extracted from language instructions to subsymbolic object goal locations in the world. We present a generative model of static and dynamic 3D spatial relations between multiple reference objects. The model is based on a parametric probability distribution defined in cylindrical coordinates and is learned from examples provided by humans manipulating a scene in the real world. We demonstrate the ability of our representation to generate suitable object goal positions for a pick-and-place task on a humanoid robot: object relations specified in natural language commands are extracted, and object goal positions are determined and used to parametrize the actions needed to transform a given scene into a new one that fulfills the specified relations.
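To make the core idea concrete, the following is a minimal sketch, not the authors' implementation: it assumes a spatial relation is modeled as independent Gaussians over cylindrical coordinates (radius r, azimuth phi, height z) around a reference object, fitted to human-provided example placements and then sampled to generate Cartesian goal positions. All class and function names here are hypothetical illustrations.

```python
import math
import random

class CylindricalRelationModel:
    """Hypothetical sketch: a parametric distribution over cylindrical
    coordinates relative to a reference object, learned from examples."""

    def fit(self, examples):
        """examples: list of (r, phi, z) offsets observed in demonstrations."""
        n = len(examples)
        self.mean = [sum(e[i] for e in examples) / n for i in range(3)]
        self.std = [
            max(1e-6, math.sqrt(sum((e[i] - self.mean[i]) ** 2 for e in examples) / n))
            for i in range(3)
        ]
        return self

    def sample_goal(self, ref_xyz, rng=random):
        """Sample a Cartesian goal position relative to the reference object."""
        r = rng.gauss(self.mean[0], self.std[0])
        phi = rng.gauss(self.mean[1], self.std[1])
        z = rng.gauss(self.mean[2], self.std[2])
        return (ref_xyz[0] + r * math.cos(phi),
                ref_xyz[1] + r * math.sin(phi),
                ref_xyz[2] + z)

# Toy demonstrations for a relation like "right of": placements cluster
# around azimuth ~ -pi/2 at ~0.2 m radius from the reference object.
demos = [(0.20, -math.pi / 2, 0.0), (0.22, -1.5, 0.01), (0.18, -1.6, -0.01)]
model = CylindricalRelationModel().fit(demos)
goal = model.sample_goal((0.5, 0.0, 0.8))  # goal position near the reference
```

A symbolic relation extracted from a command ("put the cup right of the plate") would select the corresponding fitted model, and the sampled position would parametrize the place action of the pick-and-place task.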