Hierarchical Reinforcement Learning for In-hand Robotic Manipulation Using Davenport Chained Rotations

2023 9th International Conference on Automation, Robotics and Applications (ICARA). Pub Date: 2022-10-03. DOI: 10.1109/ICARA56516.2023.10125281

Francisco Roldan Sanchez, Qiang Wang, David Córdova Bulens, Kevin McGuinness, Stephen Redmond, Noel E. O'Connor


Citations: 1

Abstract



End-to-end reinforcement learning techniques are among the most successful methods for robotic manipulation tasks. However, the training time required to find a good policy capable of solving complex tasks is prohibitively large. Therefore, depending on the computing resources available, it might not be feasible to use such techniques. The use of domain knowledge to decompose manipulation tasks into primitive skills, to be performed in sequence, could reduce the overall complexity of the learning problem, and hence reduce the amount of training required to achieve dexterity. In this paper, we propose the use of Davenport chained rotations to decompose complex 3D rotation goals into a concatenation of a smaller set of simpler rotation skills. State-of-the-art reinforcement-learning-based methods can then be trained using less overall simulated experience. We compare this learning approach with the popular Hindsight Experience Replay (HER) method, trained in an end-to-end fashion using the same amount of experience in a simulated robotic hand environment. Although the primitive skills generally perform worse when executed sequentially, we find that decomposing arbitrary 3D rotations into elementary rotations is beneficial when computing resources are limited: success rates increase by approximately 10% on the most complex 3D rotations relative to a HER-based approach trained end-to-end, and by between 20% and 40% on the simplest rotations.
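The decomposition idea in the abstract can be illustrated with a minimal sketch. The abstract does not say which rotation axes or skill set the authors use, so this example assumes the coordinate axes in an intrinsic z-y'-x'' order (Tait-Bryan angles, a special case of Davenport chained rotations). The function names `decompose_zyx` and `chained_subgoals` are hypothetical and are not taken from the paper.

```python
import numpy as np

def rot_x(a):
    """Elementary rotation about the x axis by angle a (radians)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(a):
    """Elementary rotation about the y axis by angle a (radians)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def rot_z(a):
    """Elementary rotation about the z axis by angle a (radians)."""
    c, s = np.cos(a), np.sin(a)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

def decompose_zyx(R):
    """Return (yaw, pitch, roll) with R = rot_z(yaw) @ rot_y(pitch) @ rot_x(roll).

    Assumption: |pitch| < pi/2, i.e. the goal is away from gimbal lock.
    """
    pitch = -np.arcsin(np.clip(R[2, 0], -1.0, 1.0))
    yaw = np.arctan2(R[1, 0], R[0, 0])
    roll = np.arctan2(R[2, 1], R[2, 2])
    return yaw, pitch, roll

def chained_subgoals(R):
    """Split one 3D rotation goal into three cumulative sub-goal orientations.

    Each sub-goal differs from the previous one by a single elementary
    rotation, so a simpler single-axis skill could be asked to reach it.
    """
    yaw, pitch, roll = decompose_zyx(R)
    g1 = rot_z(yaw)
    g2 = g1 @ rot_y(pitch)
    g3 = g2 @ rot_x(roll)  # equals the original goal R
    return [g1, g2, g3]

# Example goal built from known angles so the decomposition can be checked.
goal = rot_z(0.3) @ rot_y(-0.5) @ rot_x(1.2)
subgoals = chained_subgoals(goal)
```

In this sketch the last sub-goal reproduces the original goal orientation, so executing three single-axis skills in sequence would, in principle, reach the same target as one end-to-end policy.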
