Alap Kshirsagar, Tair Faibish, G. Hoffman, A. Biess
Lessons Learned from Utilizing Guided Policy Search for Human-Robot Handovers with a Collaborative Robot
2022 2nd International Conference on Robotics, Automation and Artificial Intelligence (RAAI), 2022-12-09
DOI: 10.1109/RAAI56146.2022.10092989 (https://doi.org/10.1109/RAAI56146.2022.10092989)
Abstract
We evaluate Guided Policy Search (GPS), a model-based reinforcement learning method, for generating the handover reaching motions of a collaborative robot arm. In previous work, we evaluated GPS for the same task, but only in simulation. This paper replicates those simulation findings and adds new insights on using GPS on a physical robot platform. First, a policy learned in simulation does not transfer readily to the physical robot, due to differences in model parameters and the safety constraints already in place on the real robot. Second, to train a GPS model successfully, the robot’s workspace must be severely reduced, owing to the joint-space limitations of the physical robot. Third, a policy trained with moving targets produces large worst-case errors even in regions spatially close to the training target locations. Our findings motivate further research on using GPS in human-robot interaction settings, especially where safety constraints are imposed.