Hierarchical Subgoal Generation from Language Instruction for Robot Task Planning
Authors: Zejun Yang, Li Ning, Hao Jiang, Zhaoqi Wang
Published in: 2022 China Automation Congress (CAC), 2022-11-25
DOI: 10.1109/CAC57257.2022.10054939 (https://doi.org/10.1109/CAC57257.2022.10054939)
In "Visual-Language Task Planning", the accuracy of action prediction depends on how flexibly a system processes varied language inputs. This paper studies two questions: what useful information can be obtained from language inputs and how it should be organized, and how instructions can be mapped to that organized information. To address them, (1) we organize a "Task-Subgoal-Action" logic structure that helps complete complex tasks step by step, and (2) we propose a hierarchical subgoal generation model that learns operation knowledge from training data and generates executable subgoal sequences from the given instructions. The model is trained and evaluated on the datasets from "Action Learning From Realistic Environments and Directives" (ALFRED). The subgoal sequences extracted from the predicted texts enable the robot to complete nearly 97% of ALFRED tasks. In addition, our model performs better than the language processing module in FILM, and a robot system that integrates our model performs well. These results indicate that the model makes efficient use of language inputs and provides substantial help to robot task planning.
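The "Task-Subgoal-Action" hierarchy and the step of extracting subgoal sequences from predicted text can be illustrated with a minimal Python sketch. It is not the authors' implementation: the semicolon-separated text format, the `ACTION_TEMPLATES` table, the low-level action strings, and every class and function name are assumptions made only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Subgoal:
    kind: str               # subgoal label, e.g. "PickupObject" (illustrative)
    args: tuple[str, ...]   # object / receptacle names the subgoal refers to


@dataclass
class Task:
    instruction: str                              # natural-language directive
    subgoals: list[Subgoal] = field(default_factory=list)


def parse_subgoals(predicted_text: str) -> list[Subgoal]:
    """Parse 'Kind arg1 arg2; Kind arg1' into Subgoal objects.

    The ';'-separated format is an assumption made for this sketch.
    """
    subgoals = []
    for chunk in predicted_text.split(";"):
        tokens = chunk.split()
        if not tokens:
            continue  # skip empty segments such as a trailing ';'
        subgoals.append(Subgoal(kind=tokens[0], args=tuple(tokens[1:])))
    return subgoals


# Hypothetical expansion table: subgoal kind -> low-level action templates.
ACTION_TEMPLATES = {
    "GotoLocation": ["navigate_to({0})"],
    "PickupObject": ["look_at({0})", "pickup({0})"],
    "PutObject": ["look_at({1})", "put({0}, {1})"],
}


def expand(subgoal: Subgoal) -> list[str]:
    """Expand one subgoal into its low-level action strings."""
    templates = ACTION_TEMPLATES.get(subgoal.kind)
    if templates is None:
        raise ValueError(f"unknown subgoal kind: {subgoal.kind}")
    return [t.format(*subgoal.args) for t in templates]


if __name__ == "__main__":
    text = "GotoLocation apple; PickupObject apple; GotoLocation table; PutObject apple table"
    task = Task("Put the apple on the table", parse_subgoals(text))
    # Flatten the hierarchy: task -> subgoals -> actions.
    for subgoal in task.subgoals:
        print(subgoal.kind, expand(subgoal))
```

Keeping the subgoal level explicit means the language model only has to predict a short symbolic sequence, while the mapping from each subgoal to concrete actions can be handled separately, here by a lookup table.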