PLATO: Planning with LLMs and Affordances for Tool Manipulation

Arvind Car, Sai Sravan Yarlagadda, Alison Bartsch, Abraham George, Amir Barati Farimani
arXiv - CS - Robotics, published 2024-09-17
DOI: arxiv-2409.11580 (https://doi.org/arxiv-2409.11580)
Citations: 0

Abstract

As robotic systems become increasingly integrated into complex real-world environments, there is a growing need for approaches that enable robots to understand and act upon natural language instructions without relying on extensive pre-programmed knowledge of their surroundings. This paper presents PLATO, an innovative system that addresses this challenge by leveraging specialized large language model agents to process natural language inputs, understand the environment, predict tool affordances, and generate executable actions for robotic systems. Unlike traditional systems that depend on hard-coded environmental information, PLATO employs a modular architecture of specialized agents to operate without any initial knowledge of the environment. These agents identify objects and their locations within the scene, generate a comprehensive high-level plan, translate this plan into a series of low-level actions, and verify the completion of each step. The system is particularly tested on challenging tool-use tasks, which involve handling diverse objects and require long-horizon planning. PLATO's design allows it to adapt to dynamic and unstructured settings, significantly enhancing its flexibility and robustness. By evaluating the system across various complex scenarios, we demonstrate its capability to tackle a diverse range of tasks and offer a novel solution to integrate LLMs with robotic platforms, advancing the state-of-the-art in autonomous robotic task execution. For videos and prompt details, please see our project website: https://sites.google.com/andrew.cmu.edu/plato