Sho Takeda, Satoshi Yamamori, Satoshi Yagi, Jun Morimoto
{"title":"An empirical evaluation of a hierarchical reinforcement learning method towards modular robot control","authors":"Sho Takeda, Satoshi Yamamori, Satoshi Yagi, Jun Morimoto","doi":"10.1007/s10015-025-01003-7","DOIUrl":null,"url":null,"abstract":"<div><p>There is a growing expectation that deep reinforcement learning will enable multi-degree-of-freedom robots to acquire policies suitable for real-world applications. However, a robot system with a variety of components requires many learning trials for each different combination of robot modules. In this study, we propose a hierarchical policy design to segment tasks according to different robot components. The tasks of the multi-module robot are performed by skill sets trained on a component-by-component basis. In our learning approach, each module learns reusable skills, which are then integrated to control the whole robotic system. By adopting component-based learning and reusing previously acquired policies, we transform the action space from continuous to discrete. This transformation reduces the complexity of exploration across the entire robotic system. We validated our proposed method by applying it to a valve rotation task using a combination of a robotic arm and a robotic gripper. Evaluation based on physical simulations showed that hierarchical policy construction improved sample efficiency, achieving performance comparable to the baseline with 46.3% fewer samples.</p></div>","PeriodicalId":46050,"journal":{"name":"Artificial Life and Robotics","volume":"30 2","pages":"245 - 251"},"PeriodicalIF":0.8000,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial Life and Robotics","FirstCategoryId":"1085","ListUrlMain":"https://link.springer.com/article/10.1007/s10015-025-01003-7","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q4","JCRName":"ROBOTICS","Score":null,"Total":0}
Abstract
There is a growing expectation that deep reinforcement learning will enable multi-degree-of-freedom robots to acquire policies suitable for real-world applications. However, a robot system with a variety of components requires many learning trials for each different combination of robot modules. In this study, we propose a hierarchical policy design to segment tasks according to different robot components. The tasks of the multi-module robot are performed by skill sets trained on a component-by-component basis. In our learning approach, each module learns reusable skills, which are then integrated to control the whole robotic system. By adopting component-based learning and reusing previously acquired policies, we transform the action space from continuous to discrete. This transformation reduces the complexity of exploration across the entire robotic system. We validated our proposed method by applying it to a valve rotation task using a combination of a robotic arm and a robotic gripper. Evaluation based on physical simulations showed that hierarchical policy construction improved sample efficiency, achieving performance comparable to the baseline with 46.3% fewer samples.
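The abstract outlines the core mechanism: low-level skills are trained per module (e.g., arm and gripper), and a high-level policy then selects among these frozen skills, so exploration for the combined system happens over a discrete set of skill choices rather than the raw continuous action space. The sketch below illustrates that structure only; it is not the authors' implementation, and all names (Skill, HierarchicalPolicy, the arm/gripper observation keys, the Q-table shape) are hypothetical.

```python
import numpy as np


class Skill:
    """A frozen, reusable low-level policy for a single module (e.g. arm or gripper)."""

    def __init__(self, policy_fn):
        self.policy_fn = policy_fn  # maps a module observation to a continuous command

    def act(self, module_obs):
        return self.policy_fn(module_obs)


class HierarchicalPolicy:
    """High-level policy that picks one pre-trained skill per module.

    The high-level action is a discrete index pair (arm skill, gripper skill),
    so the learner explores skill combinations instead of raw joint commands.
    """

    def __init__(self, arm_skills, gripper_skills):
        self.arm_skills = arm_skills
        self.gripper_skills = gripper_skills

    def select_skills(self, q_values):
        # q_values: (n_arm_skills, n_gripper_skills) value estimates from the high-level learner
        return np.unravel_index(np.argmax(q_values), q_values.shape)

    def act(self, obs, skill_idx):
        # Compose the whole-system command from the chosen per-module skills.
        arm_cmd = self.arm_skills[skill_idx[0]].act(obs["arm"])
        grip_cmd = self.gripper_skills[skill_idx[1]].act(obs["gripper"])
        return np.concatenate([arm_cmd, grip_cmd])


# Toy usage with dummy skills: two arm skills, two gripper skills.
arm_skills = [Skill(lambda o, k=k: np.full(6, k * 0.1)) for k in range(2)]
gripper_skills = [Skill(lambda o, k=k: np.full(2, k * 1.0)) for k in range(2)]
policy = HierarchicalPolicy(arm_skills, gripper_skills)

q_table = np.array([[0.1, 0.4], [0.3, 0.2]])   # stand-in for learned high-level values
idx = policy.select_skills(q_table)            # -> (0, 1): discrete skill choice
command = policy.act({"arm": None, "gripper": None}, idx)
```

Under this kind of decomposition, only the small discrete selector has to be trained for a new module combination, which is the intuition behind the sample-efficiency gain reported in the abstract.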