Jianlan Luo, Charles Xu, Jeffrey Wu, Sergey Levine
Science Robotics, Volume 10, Issue 105. Published 2025-08-20.
DOI: 10.1126/scirobotics.ads5033 (https://www.science.org/doi/10.1126/scirobotics.ads5033)
Precise and dexterous robotic manipulation via human-in-the-loop reinforcement learning
Robotic manipulation remains one of the most difficult challenges in robotics, with approaches ranging from classical model-based control to modern imitation learning. Although these methods have enabled substantial progress, they often require extensive manual design, struggle with performance, and demand large-scale data collection. These limitations hinder their real-world deployment at scale, where reliability, speed, and robustness are essential. Reinforcement learning (RL) offers a powerful alternative by enabling robots to autonomously acquire complex manipulation skills through interaction. However, realizing the full potential of RL in the real world remains challenging because of issues of sample efficiency and safety. We present a human-in-the-loop, vision-based RL system that achieved strong performance on a wide range of dexterous manipulation tasks, including precise assembly, dynamic manipulation, and dual-arm coordination. These tasks reflect realistic industrial tolerances, with small but critical variations in initial object placements that demand sophisticated reactive control. Our method integrates demonstrations, human corrections, sample-efficient RL algorithms, and system-level design to directly learn RL policies in the real world. Within 1 to 2.5 hours of real-world training, our approach outperformed other baselines by improving task success by 2×, achieving near-perfect success rates, and executing 1.8× faster on average. Through extensive experiments and analysis, our results suggest that RL can learn a wide range of complex vision-based manipulation policies directly in the real world within practical training times. We hope that this work will inspire a new generation of learned robotic manipulation techniques, benefiting both industrial applications and research advancements.
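The abstract describes a pipeline that combines demonstrations, human corrections, and sample-efficient RL. As a rough illustration of the human-in-the-loop data-collection idea (not the authors' actual system), the sketch below shows a toy rollout loop in which an operator can override the policy's action; corrected transitions are stored separately so they can receive extra supervision. All names, dynamics, and the intervention rule here are invented for illustration.

```python
import random

class ReplayBuffer:
    """Toy buffer that separates autonomous and human-corrected transitions."""
    def __init__(self):
        self.rl_data = []    # autonomous transitions, used for RL updates
        self.demo_data = []  # human-corrected transitions, kept for extra supervision

    def add(self, transition, intervened):
        (self.demo_data if intervened else self.rl_data).append(transition)

def policy(obs):
    # Stand-in for a learned vision-based policy.
    return random.uniform(-1.0, 1.0)

def human_correction(obs):
    # Stand-in for an operator who occasionally overrides the robot.
    # Returns a corrective action, or None when the human does not intervene.
    if random.random() < 0.2:
        return 0.0
    return None

def rollout(buffer, steps=50):
    obs = 0.0
    for _ in range(steps):
        action = policy(obs)
        correction = human_correction(obs)
        intervened = correction is not None
        if intervened:
            action = correction          # the human's action is executed instead
        next_obs = obs + 0.1 * action    # toy scalar dynamics
        reward = -abs(next_obs)          # toy reward: stay near zero
        buffer.add((obs, action, reward, next_obs), intervened)
        obs = next_obs
    return buffer
```

In a real system of this kind, the policy update would then mix an RL objective on all transitions with an imitation-style objective on the human-corrected ones, so that corrections steer the policy quickly while autonomous experience still drives improvement.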
Journal information:
Science Robotics publishes original, peer-reviewed, science- or engineering-based research articles that advance the field of robotics. The journal also features editor-commissioned Reviews. An international team of academic editors holds Science Robotics articles to the same high-quality standard that is the hallmark of the Science family of journals.
Sub-topics include: actuators, advanced materials, artificial intelligence, autonomous vehicles, bio-inspired design, exoskeletons, fabrication, field robotics, human-robot interaction, humanoids, industrial robotics, kinematics, machine learning, materials science, medical technology, motion planning and control, micro- and nano-robotics, multi-robot control, sensors, service robotics, social and ethical issues, soft robotics, and space, planetary, and undersea exploration.