Hierarchical Subgoal Generation from Language Instruction for Robot Task Planning

Zejun Yang, Li Ning, Hao Jiang, Zhaoqi Wang
{"title":"Hierarchical Subgoal Generation from Language Instruction for Robot Task Planning","authors":"Zejun Yang, Li Ning, Hao Jiang, Zhaoqi Wang","doi":"10.1109/CAC57257.2022.10054939","DOIUrl":null,"url":null,"abstract":"In the field of \"Visual-Language Task Planning\", how to flexibly process various language inputs determines the accuracy of action prediction. In this paper, we mainly study: what useful information can be obtained from language inputs and how to organize it; how to map instructions to the organized information. For these problems, (1) we organize the \"Task-Subgoal-Action\" logic structure, to help complete the complex tasks step-by-step, (2) we also propose the hierarchical subgoal generation model, which learns operation knowledge from training data, to generate executable subgoal sequences according to the given instructions. Our model is trained and evaluated with the datasets from “Action Learning From Realistic Environments and Directives ”(ALFRED), the subgoal sequences extracted from predicted texts successfully enable the robot to complete nearly 97% of ALFRED tasks\" What’s more, our model performs better than the language processing module in FILM, and the robot system which integrates our model performs well. These results indicate that our model can make full use of language inputs efficiently and provides great help to robot task planning.","PeriodicalId":287137,"journal":{"name":"2022 China Automation Congress (CAC)","volume":"267 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 China Automation Congress (CAC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CAC57257.2022.10054939","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

In the field of "Visual-Language Task Planning", how to flexibly process various language inputs determines the accuracy of action prediction. In this paper, we mainly study: what useful information can be obtained from language inputs and how to organize it; how to map instructions to the organized information. For these problems, (1) we organize the "Task-Subgoal-Action" logic structure, to help complete the complex tasks step-by-step, (2) we also propose the hierarchical subgoal generation model, which learns operation knowledge from training data, to generate executable subgoal sequences according to the given instructions. Our model is trained and evaluated with the datasets from “Action Learning From Realistic Environments and Directives ”(ALFRED), the subgoal sequences extracted from predicted texts successfully enable the robot to complete nearly 97% of ALFRED tasks" What’s more, our model performs better than the language processing module in FILM, and the robot system which integrates our model performs well. These results indicate that our model can make full use of language inputs efficiently and provides great help to robot task planning.