{"title":"为在日常环境中运行的机器人大规模学习实例级n元语义知识","authors":"Weiyu Liu, Dhruva Bansal, Angel Daruna, Sonia Chernova","doi":"10.1007/s10514-023-10099-4","DOIUrl":null,"url":null,"abstract":"<div><p>Robots operating in everyday environments need to effectively perceive, model, and infer semantic properties of objects. Existing knowledge reasoning frameworks only model binary relations between an object’s class label and its semantic properties, unable to collectively reason about object properties detected by different perception algorithms and grounded in diverse sensory modalities. We bridge the gap between multimodal perception and knowledge reasoning by introducing an n-ary representation that models complex, inter-related object properties. To tackle the problem of collecting n-ary semantic knowledge at scale, we propose transformer neural networks that generalize knowledge from observations of object instances by learning to predict single missing properties or predict joint probabilities of all properties. The learned models can reason at different levels of abstraction, effectively predicting unknown properties of objects in different environmental contexts given different amounts of observed information. We quantitatively validate our approach against prior methods on LINK, a unique dataset we contribute that contains 1457 object instances in different situations, amounting to 15 multimodal properties types and 200 total properties. Compared to the top-performing baseline, a Markov Logic Network, our models obtain a 10% improvement in predicting unknown properties of novel object instances while reducing training and inference time by more than 150 times. Additionally, we apply our work to a mobile manipulation robot, demonstrating its ability to leverage n-ary reasoning to retrieve objects and actively detect object properties. The code and data are available at https://github.com/wliu88/LINK.</p></div>","PeriodicalId":55409,"journal":{"name":"Autonomous Robots","volume":"47 5","pages":"529 - 547"},"PeriodicalIF":3.7000,"publicationDate":"2023-04-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"Learning instance-level N-ary semantic knowledge at scale for robots operating in everyday environments\",\"authors\":\"Weiyu Liu, Dhruva Bansal, Angel Daruna, Sonia Chernova\",\"doi\":\"10.1007/s10514-023-10099-4\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Robots operating in everyday environments need to effectively perceive, model, and infer semantic properties of objects. Existing knowledge reasoning frameworks only model binary relations between an object’s class label and its semantic properties, unable to collectively reason about object properties detected by different perception algorithms and grounded in diverse sensory modalities. We bridge the gap between multimodal perception and knowledge reasoning by introducing an n-ary representation that models complex, inter-related object properties. To tackle the problem of collecting n-ary semantic knowledge at scale, we propose transformer neural networks that generalize knowledge from observations of object instances by learning to predict single missing properties or predict joint probabilities of all properties. The learned models can reason at different levels of abstraction, effectively predicting unknown properties of objects in different environmental contexts given different amounts of observed information. 
We quantitatively validate our approach against prior methods on LINK, a unique dataset we contribute that contains 1457 object instances in different situations, amounting to 15 multimodal properties types and 200 total properties. Compared to the top-performing baseline, a Markov Logic Network, our models obtain a 10% improvement in predicting unknown properties of novel object instances while reducing training and inference time by more than 150 times. Additionally, we apply our work to a mobile manipulation robot, demonstrating its ability to leverage n-ary reasoning to retrieve objects and actively detect object properties. The code and data are available at https://github.com/wliu88/LINK.</p></div>\",\"PeriodicalId\":55409,\"journal\":{\"name\":\"Autonomous Robots\",\"volume\":\"47 5\",\"pages\":\"529 - 547\"},\"PeriodicalIF\":3.7000,\"publicationDate\":\"2023-04-06\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Autonomous Robots\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://link.springer.com/article/10.1007/s10514-023-10099-4\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Autonomous Robots","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10514-023-10099-4","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Learning instance-level N-ary semantic knowledge at scale for robots operating in everyday environments
Abstract:
Robots operating in everyday environments need to effectively perceive, model, and infer semantic properties of objects. Existing knowledge reasoning frameworks model only binary relations between an object's class label and its semantic properties, and so cannot collectively reason about object properties detected by different perception algorithms and grounded in diverse sensory modalities. We bridge the gap between multimodal perception and knowledge reasoning by introducing an n-ary representation that models complex, inter-related object properties. To tackle the problem of collecting n-ary semantic knowledge at scale, we propose transformer neural networks that generalize knowledge from observations of object instances by learning either to predict a single missing property or to predict the joint probability of all properties. The learned models can reason at different levels of abstraction, effectively predicting unknown properties of objects in different environmental contexts given varying amounts of observed information. We quantitatively validate our approach against prior methods on LINK, a unique dataset we contribute that contains 1457 object instances in different situations, covering 15 multimodal property types and 200 properties in total. Compared to the top-performing baseline, a Markov Logic Network, our models achieve a 10% improvement in predicting unknown properties of novel object instances while reducing training and inference time by a factor of more than 150. Additionally, we apply our work to a mobile manipulation robot, demonstrating its ability to leverage n-ary reasoning to retrieve objects and actively detect object properties. The code and data are available at https://github.com/wliu88/LINK.
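The masked-property prediction scheme described in the abstract can be made concrete with a small sketch. The PyTorch code below is purely illustrative and is not the authors' released implementation (see the GitHub repository above): the property types, vocabulary, and model sizes are invented for the example. It shows how an object instance can be encoded as an n-ary tuple of typed property slots, with a transformer encoder trained to fill in a masked slot from the remaining observed properties, analogous to masked language modeling.

```python
# A minimal, hypothetical sketch of n-ary property prediction; all names and
# values here are illustrative assumptions, not the paper's actual code.
import torch
import torch.nn as nn

PROPERTY_TYPES = ["class", "color", "material", "weight", "location"]  # invented subset of property types
VOCAB = {"<mask>": 0, "mug": 1, "red": 2, "ceramic": 3, "heavy": 4, "kitchen": 5}

class NaryPropertyTransformer(nn.Module):
    def __init__(self, vocab_size, n_slots, d_model=64):
        super().__init__()
        self.value_emb = nn.Embedding(vocab_size, d_model)  # embeds property values
        self.slot_emb = nn.Embedding(n_slots, d_model)      # encodes which property type each slot holds
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, vocab_size)          # per-slot distribution over property values

    def forward(self, values):
        # values: (batch, n_slots) tensor of value ids, where id 0 marks a masked slot
        slots = torch.arange(values.size(1), device=values.device)
        h = self.value_emb(values) + self.slot_emb(slots)   # value embedding + property-type embedding
        h = self.encoder(h)                                 # slots attend to one another: n-ary interactions
        return self.head(h)                                 # (batch, n_slots, vocab) logits

# One instance: a red ceramic mug in the kitchen, with its "weight" slot masked.
instance = torch.tensor([[1, 2, 3, 0, 5]])  # class=mug, color=red, material=ceramic, <mask>, location=kitchen
model = NaryPropertyTransformer(len(VOCAB), len(PROPERTY_TYPES))
logits = model(instance)
print(logits[0, 3].softmax(-1))             # predicted distribution over values for the masked weight slot
```

Treating the property type as a slot embedding lets the encoder condition each prediction on all observed properties at once, which is what distinguishes this n-ary formulation from the pairwise (class label, property) relations used by prior knowledge reasoning frameworks.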
Journal description:
Autonomous Robots reports on the theory and applications of robotic systems capable of some degree of self-sufficiency. It features papers that include performance data on actual robots in the real world. Coverage includes: control of autonomous robots · real-time vision · autonomous wheeled and tracked vehicles · legged vehicles · computational architectures for autonomous systems · distributed architectures for learning, control and adaptation · studies of autonomous robot systems · sensor fusion · theory of autonomous systems · terrain mapping and recognition · self-calibration and self-repair for robots · self-reproducing intelligent structures · genetic algorithms as models for robot development.
The focus is on the ability to move and be self-sufficient, not on whether the system is an imitation of biology. Of course, biological models for robotic systems are of major interest to the journal since living systems are prototypes for autonomous behavior.