Continuous Adaptation in Nonstationary Environments Based on Actor-Critic Algorithm

Yang Yu, Zhixiong Gan, Chun Xing Li, Hui Luo, Jiashou Wang

2022 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), published 2022-11-22. DOI: 10.1109/ISPACS57703.2022.10082809 (https://doi.org/10.1109/ISPACS57703.2022.10082809)
Abstract
In reinforcement learning, the agent's training process depends heavily on the dynamics, and the agent's dynamics are generally considered part of the environment. When the dynamics change, the previously learned model may be unable to adapt to the new environment. In this paper, we propose a simple adaptive method based on the traditional actor-critic framework. A new component, named the Adaptor, is added to the original model. The kernel of the Adaptor is a network with the same structure as the Critic, and the component adaptively adjusts the Actor's actions. Experiments show that agents pre-trained in different environments, including Gym and MuJoCo, adapt to new, dynamics-changed environments better than the original methods do. Moreover, the proposed method outperforms the baseline method that learns from scratch on some of the original tasks.
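The abstract does not spell out how the Adaptor intervenes, only that its kernel shares the Critic's structure and that it adjusts the Actor's actions. The following is a minimal PyTorch sketch of one plausible reading: the Adaptor scores (state, action) pairs like a critic, and a one-step gradient correction on its value nudges the pre-trained Actor's action. The names (Actor, Critic, Adaptor, adapt_action), the dimensions, and the correction rule itself are all assumptions for illustration, not the paper's exact method.

```python
# Hypothetical sketch of the Adaptor idea; the gradient-based correction
# rule below is an assumption, not the paper's stated update.
import torch
import torch.nn as nn

class Actor(nn.Module):
    """Deterministic policy mapping observations to actions in [-1, 1]."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, act_dim), nn.Tanh(),
        )

    def forward(self, obs):
        return self.net(obs)

class Critic(nn.Module):
    """State-action value network Q(s, a)."""
    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim + act_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1))

# Per the abstract, the Adaptor's kernel has the same structure as the
# Critic, so the sketch simply reuses the Critic class for it.
class Adaptor(Critic):
    pass

def adapt_action(adaptor, obs, act, step_size=0.1):
    """Adjust the pre-trained Actor's action by ascending the Adaptor's
    value for one gradient step; one plausible reading of 'adaptively
    adjust the Actor's actions'."""
    act = act.detach().requires_grad_(True)
    value = adaptor(obs, act).sum()
    grad = torch.autograd.grad(value, act)[0]
    return (act + step_size * grad).clamp(-1.0, 1.0).detach()

# Usage with made-up dimensions:
obs_dim, act_dim = 8, 2
actor, adaptor = Actor(obs_dim, act_dim), Adaptor(obs_dim, act_dim)
obs = torch.randn(4, obs_dim)               # batch of observations
raw = actor(obs)                            # actions from the pre-trained policy
adjusted = adapt_action(adaptor, obs, raw)  # Adaptor-corrected actions
```

Under this reading, the pre-trained Actor and Critic stay frozen after the dynamics change, and only the Adaptor would be trained in the new environment, which is consistent with the abstract's claim that pre-trained agents adapt faster than learning from scratch.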