Towards Autonomous Reinforcement Learning for Real-World Robotic Manipulation With Large Language Models

Impact Factor: 4.6 · CAS Zone 2 (Computer Science) · JCR Q2 (Robotics)
Niccolò Turcato;Matteo Iovino;Aris Synodinos;Alberto Dalla Libera;Ruggero Carli;Pietro Falco
DOI: 10.1109/LRA.2025.3589162
Journal: IEEE Robotics and Automation Letters, vol. 10, no. 9, pp. 8850-8857
Published: 2025-07-14 (Journal Article)
URL: https://ieeexplore.ieee.org/document/11080043/
Citations: 0

Abstract

Recent advancements in Large Language Models (LLMs) and Visual Language Models (VLMs) have significantly impacted robotics, enabling high-level semantic motion planning applications. Reinforcement Learning (RL), a complementary paradigm, enables agents to autonomously optimize complex behaviors through interaction and reward signals. However, designing effective reward functions for RL remains challenging, especially in real-world tasks where sparse rewards are insufficient and dense rewards require elaborate design. In this work, we propose Autonomous Reinforcement learning for Complex Human-Informed Environments (ARCHIE), an unsupervised pipeline leveraging GPT-4, a pre-trained LLM, to generate reward functions directly from natural language task descriptions. The rewards are used to train RL agents in simulated environments, where we formalize the reward generation process to enhance feasibility. Additionally, GPT-4 automates the coding of task success criteria, creating a fully automated, one-shot procedure for translating human-readable text into deployable robot skills. Our approach is validated through extensive simulated experiments on single-arm and bi-manual manipulation tasks using an ABB YuMi collaborative robot, highlighting its practicality and effectiveness. Tasks are demonstrated on the real robot setup.
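The pipeline the abstract describes — prompt a pre-trained LLM with a natural-language task description, receive executable reward and success-criterion code, and plug both into RL training — can be illustrated with a minimal sketch. All names below (`PROMPT_TEMPLATE`, `load_generated_skill`, the state layout) are hypothetical, and the "generated" code is hand-written here for illustration; in the actual ARCHIE pipeline that string would come from GPT-4.

```python
import math

TASK = "Pick up the cube and place it at the goal position."

# Illustrative prompt shape; the paper's actual prompting scheme may differ.
PROMPT_TEMPLATE = (
    "You are writing a reward function for a robot RL environment.\n"
    "Task: {task}\n"
    "Return Python code defining reward(state) -> float and "
    "success(state) -> bool."
)

# Example of what an LLM might return for TASK (hand-written stand-in).
GENERATED_CODE = """
import math

def reward(state):
    # Dense shaping: negative gripper-to-cube distance plus
    # negative cube-to-goal distance.
    d_grasp = math.dist(state["gripper"], state["cube"])
    d_goal = math.dist(state["cube"], state["goal"])
    return -d_grasp - d_goal

def success(state):
    # Sparse criterion: cube within 2 cm of the goal.
    return math.dist(state["cube"], state["goal"]) < 0.02
"""

def load_generated_skill(code: str):
    """Execute LLM-generated code in a fresh namespace and extract
    the reward and success callables for use in RL training."""
    ns: dict = {}
    exec(code, ns)
    return ns["reward"], ns["success"]

reward_fn, success_fn = load_generated_skill(GENERATED_CODE)

# Toy state: gripper 10 cm above a cube that already sits at the goal.
state = {
    "gripper": (0.0, 0.0, 0.1),
    "cube": (0.0, 0.0, 0.0),
    "goal": (0.0, 0.0, 0.0),
}
print(round(reward_fn(state), 3))  # -0.1 (only the grasp distance remains)
print(success_fn(state))           # True
```

The generated `success` predicate is what makes the procedure "one-shot": once both functions pass in simulation, the same text-to-skill translation needs no human-designed reward engineering.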
Source Journal

IEEE Robotics and Automation Letters (Computer Science — Computer Science Applications)
CiteScore: 9.60
Self-citation rate: 15.40%
Articles per year: 1428
Journal scope: The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.