Reinforcement learning of sensor-based reaching strategies for a two-link manipulator
P. Martín, J. Millán
Proceedings of IEEE/RSJ International Conference on Intelligent Robots and Systems. IROS '96, 1996-11-04
DOI: 10.1109/IROS.1996.568991 (https://doi.org/10.1109/IROS.1996.568991)
Citations: 7
Abstract
This paper presents a neural controller that learns goal-oriented, obstacle-avoiding reaction strategies for a multilink robot arm. The controller acquires these strategies through reinforcement learning from local sensory data, provided by rings of range sensors placed along the arm's links. It reaches good performance quickly and generalizes well to new environments. Suitable input and output codification schemes contribute greatly to these results. The input codification exploits the inherent symmetry of the robot kinematics, and the action produced by the controller is interpreted relative to the shortest path vector (SPV) to the closest goal in configuration space. To avoid computing the SPV for multilink manipulators, we propose a differential inverse kinematics module based on inverting a neural network previously trained to approximate the manipulator's forward kinematics. This module not only circumvents the SPV calculation but also speeds up learning.
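
The idea of obtaining differential inverse kinematics by inverting a forward model can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a planar two-link arm with unit link lengths, uses the analytic forward kinematics as a stand-in for the trained neural network, and uses central finite differences as a stand-in for backpropagating the output error to the network inputs. All function names and parameter values are hypothetical.

```python
import math


def forward_kinematics(q, l1=1.0, l2=1.0):
    """End-effector position of a planar two-link arm.

    Assumption: this analytic model stands in for the trained
    forward-kinematics network described in the abstract.
    """
    q1, q2 = q
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return (x, y)


def task_error(q, target):
    """Euclidean distance between the end effector and the target."""
    x, y = forward_kinematics(q)
    return math.hypot(x - target[0], y - target[1])


def input_gradient(q, target, eps=1e-6):
    """Gradient of 0.5 * error^2 with respect to the joint angles.

    Assumption: central differences replace backpropagation through
    the network to its inputs.
    """
    def loss(angles):
        return 0.5 * task_error(angles, target) ** 2

    grad = []
    for i in range(len(q)):
        q_plus, q_minus = list(q), list(q)
        q_plus[i] += eps
        q_minus[i] -= eps
        grad.append((loss(q_plus) - loss(q_minus)) / (2.0 * eps))
    return grad


def differential_ik_step(q, target, step=0.1):
    """One small joint-space move that reduces the task-space error."""
    grad = input_gradient(q, target)
    return [qi - step * gi for qi, gi in zip(q, grad)]


def reach(q, target, steps=500, step=0.1):
    """Repeat differential steps from configuration q toward the target."""
    for _ in range(steps):
        q = differential_ik_step(q, target, step)
    return q


if __name__ == "__main__":
    start = [0.3, 0.5]
    goal = (1.0, 1.0)
    final = reach(start, goal)
    print("final joint angles:", final)
    print("remaining error:", task_error(final, goal))
```

Each step only needs the current configuration and the task-space goal, so no joint-space goal (and hence no SPV) has to be computed beforehand. In this sketch the step size and iteration count are arbitrary choices, and the example ignores obstacles and joint limits.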