Reward-driven learning of sensorimotor laws and visual features
Jens Kleesiek, A. Engel, C. Weber, S. Wermter
2011 IEEE International Conference on Development and Learning (ICDL), 2011-10-10
DOI: 10.1109/DEVLRN.2011.6037358 (https://doi.org/10.1109/DEVLRN.2011.6037358)
Citation count: 2
Abstract
A frequently recurring task for humanoid robots is autonomous navigation towards a goal position. Here we present a simulation of a purely vision-based docking behavior in a 3-D physical world. The robot learns sensorimotor laws and visual features simultaneously and exploits both to navigate towards its virtual target region. The control laws are trained using a two-layer network in which a feature (sensory) layer feeds into an action (Q-value) layer. A reinforcement feedback signal (delta) modulates not only the action weights but also, at the same time, the feature weights. Under this influence, the network learns interpretable visual features and successfully assigns goal-directed actions. This is a step towards investigating how reinforcement learning can be linked to visual perception.
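The two-layer arrangement described in the abstract can be illustrated with a minimal sketch. The abstract does not give the update equations, so everything concrete here is an assumption made for illustration: the class name `TwoLayerQNetwork`, the sigmoid feature units, a SARSA-style temporal-difference error standing in for the delta signal, plain gradient steps on both weight matrices, and the learning rate and discount values. It is not the authors' implementation; it only shows how a single scalar delta can scale the updates of both the action weights and the feature weights.

```python
import numpy as np


class TwoLayerQNetwork:
    """Hypothetical sketch: a feature layer feeding a Q-value layer.

    Both weight matrices are updated with the same TD error (delta).
    """

    def __init__(self, n_inputs, n_features, n_actions,
                 lr=0.05, gamma=0.9, seed=0):
        rng = np.random.default_rng(seed)
        # Feature (sensory) weights: input -> features.
        self.W = rng.normal(0.0, 0.1, size=(n_features, n_inputs))
        # Action (Q-value) weights: features -> one Q-value per action.
        self.Q = rng.normal(0.0, 0.1, size=(n_actions, n_features))
        self.lr = lr        # assumed learning rate
        self.gamma = gamma  # assumed discount factor

    def features(self, x):
        # Sigmoid feature activations (an assumed nonlinearity).
        return 1.0 / (1.0 + np.exp(-self.W @ x))

    def q_values(self, x):
        return self.Q @ self.features(x)

    def update(self, x, action, reward, x_next=None, next_action=None):
        """Apply one SARSA-style update and return the TD error delta."""
        f = self.features(x)
        q = self.Q[action] @ f
        target = reward
        if x_next is not None:
            # Non-terminal step: bootstrap from the next state-action value.
            target += self.gamma * self.q_values(x_next)[next_action]
        delta = target - q
        # Sensitivity of the chosen Q-value to each feature's pre-activation,
        # computed with the action weights from before this update.
        grad_hidden = self.Q[action] * f * (1.0 - f)
        # The same delta scales the action-weight update ...
        self.Q[action] += self.lr * delta * f
        # ... and the feature-weight update.
        self.W += self.lr * delta * np.outer(grad_hidden, x)
        return delta


# Usage: repeat a single rewarded terminal transition on a toy input vector.
net = TwoLayerQNetwork(n_inputs=6, n_features=4, n_actions=3)
x = np.array([1.0, 0.0, 0.5, 0.0, 1.0, 0.2])
W_before = net.W.copy()
deltas = [net.update(x, action=0, reward=1.0) for _ in range(200)]
print(abs(deltas[0]), abs(deltas[-1]))
```

In this toy run the magnitude of delta shrinks as the Q-value for the rewarded action approaches the reward, and the feature weights move away from their initial values along with the action weights. A vision-based setup like the one in the abstract would replace the toy vector `x` with image input from the simulated robot.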