Title: State space construction for behavior acquisition in multi agent environments with vision and action
Authors: E. Uchibe, M. Asada, K. Hosoda
DOI: 10.1109/ICCV.1998.710819 (https://doi.org/10.1109/ICCV.1998.710819)
Published in: Sixth International Conference on Computer Vision (IEEE Cat. No.98CH36271), 1998
Citations: 25
Abstract
This paper proposes a method that estimates the relationships between the learner's behaviors and those of other agents in the environment through interactions (observation and action), using system identification. To identify a model of each agent, Akaike's Information Criterion (AIC) is applied to the results of Canonical Variate Analysis (CVA) of the relationship between the observed action data and future observations. Reinforcement learning based on the estimated state vectors is then performed to obtain the optimal behavior. The proposed method is applied to a soccer-playing task, in which a rolling ball and other moving agents are well modeled and the learner's behaviors are successfully acquired. Computer simulations and real experiments are presented, followed by a discussion.
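The pipeline the abstract describes — build past/future windows of the observed data, run Canonical Variate Analysis between them, and pick the state dimension with an AIC-style criterion — can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the Hankel lag, the exact AIC penalty term (here a simplified parameter count, loosely following Larimore-style CVA criteria), and the function names `block_hankel` / `cva_aic` are all assumptions introduced for the example.

```python
import numpy as np

def block_hankel(x, lag):
    """Stack `lag` consecutive samples of x (N x d) into columns (lag*d x N-lag+1)."""
    N, d = x.shape
    cols = N - lag + 1
    H = np.empty((lag * d, cols))
    for i in range(cols):
        H[:, i] = x[i:i + lag].ravel()
    return H

def _inv_sqrt(S, eps=1e-10):
    """Symmetric inverse square root via eigendecomposition."""
    w, V = np.linalg.eigh(S)
    w = np.clip(w, eps, None)
    return V @ np.diag(w ** -0.5) @ V.T

def cva_aic(io_data, lag, max_order):
    """CVA between past and future windows of the observed record, with a
    simple AIC-style order selection (a simplification, not the paper's exact
    criterion). Returns (canonical correlations, AIC values, order, states)."""
    past = block_hankel(io_data[:-lag], lag)    # past window at each time step
    future = block_hankel(io_data[lag:], lag)   # future window at each time step
    T = min(past.shape[1], future.shape[1])
    past, future = past[:, :T] - past[:, :T].mean(axis=1, keepdims=True), \
                   future[:, :T] - future[:, :T].mean(axis=1, keepdims=True)
    Sp = past @ past.T / T
    Sf = future @ future.T / T
    Sfp = future @ past.T / T
    Wf, Wp = _inv_sqrt(Sf), _inv_sqrt(Sp)
    # Canonical correlations = singular values of the whitened cross-covariance.
    U, rho, Vt = np.linalg.svd(Wf @ Sfp @ Wp)
    rho = np.clip(rho, 0.0, 1.0 - 1e-12)
    k = min(max_order, len(rho))
    p, f = past.shape[0], future.shape[0]
    # Likelihood-ratio term for correlations beyond order n, plus a penalty.
    aic = np.array([-T * np.sum(np.log(1.0 - rho[n:] ** 2)) + 2.0 * n * (p + f - n)
                    for n in range(1, k + 1)])
    order = int(np.argmin(aic)) + 1
    states = (Vt[:order] @ Wp) @ past           # estimated state vectors
    return rho, aic, order, states
```

The estimated `states` would then serve as the state input to a reinforcement learner, in place of hand-designed state variables.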