Towards Autonomous Reinforcement Learning for Real-World Robotic Manipulation With Large Language Models

Impact Factor: 4.6 · CAS Zone 2 (Computer Science) · JCR Q2 (Robotics)
Niccolò Turcato;Matteo Iovino;Aris Synodinos;Alberto Dalla Libera;Ruggero Carli;Pietro Falco
DOI: 10.1109/LRA.2025.3589162
Journal: IEEE Robotics and Automation Letters, vol. 10, no. 9, pp. 8850-8857
Published: 2025-07-14 (Journal Article)
URL: https://ieeexplore.ieee.org/document/11080043/
Citations: 0

Abstract

Recent advancements in Large Language Models (LLMs) and Visual Language Models (VLMs) have significantly impacted robotics, enabling high-level semantic motion planning applications. Reinforcement Learning (RL), a complementary paradigm, enables agents to autonomously optimize complex behaviors through interaction and reward signals. However, designing effective reward functions for RL remains challenging, especially in real-world tasks where sparse rewards are insufficient and dense rewards require elaborate design. In this work, we propose Autonomous Reinforcement learning for Complex Human-Informed Environments (ARCHIE), an unsupervised pipeline leveraging GPT-4, a pre-trained LLM, to generate reward functions directly from natural language task descriptions. The rewards are used to train RL agents in simulated environments, where we formalize the reward generation process to enhance feasibility. Additionally, GPT-4 automates the coding of task success criteria, creating a fully automated, one-shot procedure for translating human-readable text into deployable robot skills. Our approach is validated through extensive simulated experiments on single-arm and bi-manual manipulation tasks using an ABB YuMi collaborative robot, highlighting its practicality and effectiveness. Tasks are demonstrated on the real robot setup.
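The pipeline the abstract describes — prompt a pre-trained LLM with a natural-language task description, receive executable reward and success-criterion code, and plug both into RL training — can be illustrated with a minimal sketch. All names below (`PROMPT_TEMPLATE`, `load_generated_skill`, the state layout) are hypothetical, and the "generated" code is hand-written here for illustration; in the actual ARCHIE pipeline that string would come from GPT-4.

```python
import math

TASK = "Pick up the cube and place it at the goal position."

# Illustrative prompt shape; the paper's actual prompting scheme may differ.
PROMPT_TEMPLATE = (
    "You are writing a reward function for a robot RL environment.\n"
    "Task: {task}\n"
    "Return Python code defining reward(state) -> float and "
    "success(state) -> bool."
)

# Example of what an LLM might return for TASK (hand-written stand-in).
GENERATED_CODE = """
import math

def reward(state):
    # Dense shaping: negative gripper-to-cube distance plus
    # negative cube-to-goal distance.
    d_grasp = math.dist(state["gripper"], state["cube"])
    d_goal = math.dist(state["cube"], state["goal"])
    return -d_grasp - d_goal

def success(state):
    # Sparse criterion: cube within 2 cm of the goal.
    return math.dist(state["cube"], state["goal"]) < 0.02
"""

def load_generated_skill(code: str):
    """Execute LLM-generated code in a fresh namespace and extract
    the reward and success callables for use in RL training."""
    ns: dict = {}
    exec(code, ns)
    return ns["reward"], ns["success"]

reward_fn, success_fn = load_generated_skill(GENERATED_CODE)

# Toy state: gripper 10 cm above a cube that already sits at the goal.
state = {
    "gripper": (0.0, 0.0, 0.1),
    "cube": (0.0, 0.0, 0.0),
    "goal": (0.0, 0.0, 0.0),
}
print(round(reward_fn(state), 3))  # -0.1 (only the grasp distance remains)
print(success_fn(state))           # True
```

The generated `success` predicate is what makes the procedure "one-shot": once both functions pass in simulation, the same text-to-skill translation needs no human-designed reward engineering.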
Source Journal

IEEE Robotics and Automation Letters (Computer Science — Computer Science Applications)
CiteScore: 9.60
Self-citation rate: 15.40%
Articles per year: 1428
Journal scope: The scope of this journal is to publish peer-reviewed articles that provide a timely and concise account of innovative research ideas and application results, reporting significant theoretical findings and application case studies in areas of robotics and automation.