Hierarchical Reinforcement Learning for In-hand Robotic Manipulation Using Davenport Chained Rotations

Francisco Roldan Sanchez, Qiang Wang, David Córdova Bulens, Kevin McGuinness, Stephen Redmond, Noel E. O'Connor
Published in: 2023 9th International Conference on Automation, Robotics and Applications (ICARA)
DOI: 10.1109/ICARA56516.2023.10125281
URL: https://doi.org/10.1109/ICARA56516.2023.10125281
Publication date: 2022-10-03
Citations: 1

Abstract

End-to-end reinforcement learning techniques are among the most successful methods for robotic manipulation tasks. However, the training time required to find a good policy capable of solving complex tasks is prohibitively large. Therefore, depending on the computing resources available, it might not be feasible to use such techniques. Using domain knowledge to decompose manipulation tasks into primitive skills, performed in sequence, could reduce the overall complexity of the learning problem and hence the amount of training required to achieve dexterity. In this paper, we propose the use of Davenport chained rotations to decompose complex 3D rotation goals into a concatenation of a smaller set of simpler rotation skills. State-of-the-art reinforcement-learning-based methods can then be trained using less overall simulated experience. We compare this learning approach with the popular Hindsight Experience Replay (HER) method, trained in an end-to-end fashion using the same amount of experience in a simulated robotic hand environment. Despite a general decrease in the performance of the primitive skills when they are executed sequentially, we find that decomposing arbitrary 3D rotations into elementary rotations is beneficial when computing resources are limited: success rates increase by approximately 10% on the most complex 3D rotations relative to a HER-based approach trained end-to-end, and by between 20% and 40% on the simplest rotations.
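
The decomposition idea in the abstract can be illustrated with a minimal sketch. It assumes the intrinsic z-y'-x'' (Tait-Bryan) sequence, which is one special case of Davenport chained rotations; the abstract does not state which axes or skill set the authors use, so the axis choice and all function names here (`decompose_zyx`, `compose`, and so on) are hypothetical. Each returned (axis, angle) pair stands in for the sub-goal that would be handed to one primitive rotation skill.

```python
import math

# Elementary rotation matrices about the coordinate axes (angles in radians).
def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(a):
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

AXIS_ROTATIONS = {"x": rot_x, "y": rot_y, "z": rot_z}

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def decompose_zyx(r):
    """Split R = Rz(yaw) @ Ry(pitch) @ Rx(roll) into three elementary sub-goals.

    Assumption: intrinsic z-y'-x'' order, chosen only for illustration.
    """
    pitch = -math.asin(max(-1.0, min(1.0, r[2][0])))
    if abs(math.cos(pitch)) < 1e-9:
        # Gimbal lock: yaw and roll are coupled, so roll is fixed to zero.
        yaw = math.atan2(-r[0][1], r[1][1])
        roll = 0.0
    else:
        yaw = math.atan2(r[1][0], r[0][0])
        roll = math.atan2(r[2][1], r[2][2])
    return [("z", yaw), ("y", pitch), ("x", roll)]

def compose(subgoals):
    """Chain elementary rotations in order; intrinsic rotations right-multiply."""
    r = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for axis, angle in subgoals:
        r = matmul(r, AXIS_ROTATIONS[axis](angle))
    return r

def max_abs_diff(a, b):
    """Largest element-wise difference between two 3x3 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

# Build an arbitrary 3D rotation goal, split it into elementary sub-goals,
# and confirm that executing the sub-goals in sequence reproduces the goal.
target = compose([("z", 0.7), ("y", -0.4), ("x", 1.2)])
subgoals = decompose_zyx(target)
for axis, angle in subgoals:
    print(f"rotate about {axis} by {angle:.3f} rad")
print("reconstruction error:", max_abs_diff(compose(subgoals), target))
```

In a hierarchical setup along these lines, a high-level controller would iterate over `subgoals` and invoke the policy trained for the corresponding axis, rather than asking a single end-to-end policy to reach `target` directly.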