One-shot sim-to-real transfer policy for robotic assembly via reinforcement learning with visual demonstration

IF 1.9 | Zone 4, Computer Science | JCR Q3, ROBOTICS

Robotica | Pub Date: 2024-01-24 | DOI: 10.1017/s0263574724000092

Ruihong Xiao, Chenguang Yang, Yiming Jiang, Hui Zhang


Citations: 0

Abstract



Reinforcement learning (RL) has been successfully applied to a wide range of robot manipulation tasks and continuous control problems. However, its use in industrial applications remains limited, and it faces three major challenges: sample inefficiency, real-world data collection, and the gap between simulator and reality. In this paper, we focus on the practical application of RL to robot assembly in the real world. We apply enlightenment learning to improve proximal policy optimization (PPO), an on-policy, model-free, actor-critic RL algorithm, and train an agent in Cartesian space using proprioceptive information. Enlightenment learning is incorporated via pretraining, which reduces the cost of policy training and improves the effectiveness of the policy. For pretraining, a human-like assembly trajectory is generated by a two-step method that segments objects by location and then applies iterative closest point (ICP). We also design a sim-to-real controller to correct errors that arise when transferring to reality. We set up the environment in the MuJoCo simulator and demonstrate the proposed method on the recently established National Institute of Standards and Technology (NIST) gear assembly benchmark. The paper introduces a framework that enables a robot to learn assembly tasks efficiently from limited real-world samples by leveraging simulation and visual demonstrations. Comparative experiments indicate that our approach surpasses the baseline methods in training speed, success rate, and efficiency.
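The abstract names proximal policy optimization as the base algorithm but does not give its objective. As a rough illustration only, the sketch below computes the standard clipped surrogate loss commonly associated with PPO; it is not the authors' enlightenment-learning variant, and the function name `ppo_clip_loss` and the default `clip_eps=0.2` are assumptions made for this example.

```python
import math


def ppo_clip_loss(new_logp, old_logp, advantages, clip_eps=0.2):
    """Negative clipped surrogate objective, averaged over samples.

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy (hypothetical inputs).
    advantages: advantage estimates for the same samples.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio between the current and the old policy.
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        # Pessimistic (lower) bound of the two surrogate terms.
        total += min(ratio * adv, clipped * adv)
    # Return a loss to minimize, hence the sign flip.
    return -total / len(advantages)
```

With a positive advantage, the clipping removes any incentive to push the probability ratio above `1 + clip_eps`, which is what keeps each on-policy update close to the policy that collected the data.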


Source journal

Robotica (Engineering & Technology: Robotics)

CiteScore

4.50

Self-citation rate

22.20%

Articles published

181

Review time

9.9 months

About the journal: Robotica is a forum for the multidisciplinary subject of robotics and encourages developments, applications and research in this important field of automation and robotics with regard to industry, health, education and economic and social aspects of relevance. Coverage includes activities in hostile environments, applications in the service and manufacturing industries, biological robotics, dynamics and kinematics involved in robot design and uses, on-line robots, robot task planning, rehabilitation robotics, sensory perception, software in the widest sense, particularly in respect of programming languages and links with CAD/CAM systems, telerobotics and various other areas. In addition, interest is focused on various Artificial Intelligence topics of theoretical and practical interest.