Title: TUCA-HER: An Improved HER for Robot Manipulation Skill Learning via Trajectory Utility and Conservative Advantage
Authors: Peiliang Wu; Zhaoqi Wang; Yao Li; Wenbai Chen; Guowei Gao
Journal: IEEE Transactions on Emerging Topics in Computational Intelligence, vol. 9, no. 5, pp. 3560-3571
Publication date: 2025-03-18
Publication type: Journal Article
DOI: 10.1109/TETCI.2025.3548787
URL: https://ieeexplore.ieee.org/document/10930738/
Impact factor: 5.3; JCR: Q1 (Computer Science, Artificial Intelligence)
Citation count: 0
Abstract
In the realm of multi-goal reinforcement learning for robot manipulation, effectively addressing sparse rewards has been a key challenge. The hindsight experience replay (HER) mechanism has provided notable advancements in this domain, yet its efficiency and adaptability still require further improvement. This paper introduces TUCA-HER for robot manipulation skill learning via Trajectory Utility and Conservative Advantage. We start by computing trajectory utility for experience samples collected in the early stages of training, which allows for dynamic relabeling and significantly enhances sample efficiency. Furthermore, we integrate conservative advantage learning into the actor-critic framework, reshaping rewards to construct TUCA-HER. Finally, we apply TUCA-HER to robot manipulation skill learning tasks, providing details on algorithmic implementation and complexity analysis. Evaluations conducted on OpenAI Fetch and Hand environments demonstrate TUCA-HER's superior performance in sample efficiency and task success rate compared to other algorithms. Notably, in the FetchPickAndPlace task, TUCA-HER showcases a remarkable 46% improvement over the Double experience replay buffer Adaptive Soft Hindsight Experience Replay (DAS-HER). Furthermore, Sim-to-Real experiments are conducted to validate the effectiveness of TUCA-HER in real-world environments.
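For readers unfamiliar with the baseline mechanism the abstract builds on, the following is a minimal sketch of standard HER "future" relabeling with a sparse reward. It does not implement TUCA-HER's trajectory utility or conservative advantage terms, which the abstract does not specify. The names `sparse_reward`, `her_relabel`, the transition keys, the tolerance `tol`, and the relabel count `k` are all illustrative assumptions.

```python
import random


def sparse_reward(achieved, goal, tol=0.05):
    """Return 0.0 if `achieved` is within `tol` of `goal`, else -1.0.

    Assumption: a Fetch-style sparse reward on Euclidean distance.
    """
    dist = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def her_relabel(episode, k=4, rng=None):
    """Augment an episode with hindsight goals (standard HER, 'future' strategy).

    `episode` is a list of dicts with the hypothetical keys
    'obs', 'action', 'achieved_next', and 'goal'.
    For each step, the original transition is kept and `k` copies are added
    whose goal is an outcome actually achieved at that step or later.
    """
    rng = rng or random.Random(0)
    transitions = []
    for t, step in enumerate(episode):
        # Original transition, rewarded against the originally intended goal.
        transitions.append(
            {**step, "reward": sparse_reward(step["achieved_next"], step["goal"])}
        )
        for _ in range(k):
            # Pick a step from t to the end of the episode and treat what
            # was achieved there as if it had been the goal all along.
            future = rng.randrange(t, len(episode))
            new_goal = episode[future]["achieved_next"]
            transitions.append(
                {
                    **step,
                    "goal": new_goal,
                    "reward": sparse_reward(step["achieved_next"], new_goal),
                }
            )
    return transitions
```

In this sketch an episode that never reaches its intended goal still yields some zero-reward transitions after relabeling, which is how HER mitigates the sparse-reward problem described above.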
About the journal:
The IEEE Transactions on Emerging Topics in Computational Intelligence (TETCI) publishes original articles on emerging aspects of computational intelligence, including theory, applications, and surveys.
TETCI is an electronic-only publication. TETCI publishes six issues per year.
Authors are encouraged to submit manuscripts on any emerging topic in computational intelligence, especially nature-inspired computing topics not covered by other IEEE Computational Intelligence Society journals. Illustrative examples include glial cell networks, computational neuroscience, brain-computer interfaces, ambient intelligence, non-fuzzy computing with words, artificial life, cultural learning, artificial endocrine networks, social reasoning, artificial hormone networks, and computational intelligence for the IoT and Smart-X technologies.