Takashi Nagata, Jinwei Xing, Tsutomu Kumazawa, E. Neftci
2022 International Joint Conference on Neural Networks (IJCNN), July 18, 2022
DOI: 10.1109/IJCNN55064.2022.9892778
Uncertainty Aware Model Integration on Reinforcement Learning
Model-based reinforcement learning is an effective approach to reducing sample complexity: it augments real experience with data generated by a learned model of the environment. Dyna is a well-known architecture of this kind, integrating learning from interactions with the environment and from a model of the environment. Although a model can greatly speed up the agent's learning, acquiring an accurate model remains hard despite the recent success of function approximation with neural networks. An inaccurate model degrades the agent's performance and raises a further question: to what extent should an agent rely on the model when updating its policy? In this paper, we propose incorporating the confidence of the model's simulations into the integrated learning process, so that the agent avoids updating its policy on uncertain model simulations. To estimate this confidence, we apply the Monte Carlo dropout technique to the state transition model. We show that this approach improves early-stage training, helping the agent reach reasonable performance faster. We conduct experiments on simulated robotic locomotion tasks to demonstrate the effectiveness of our approach.
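The core idea (confidence via Monte Carlo dropout on the transition model, then gating simulated updates) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the network sizes, dropout rate, number of stochastic passes, and the confidence threshold are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

class DropoutTransitionModel:
    """Toy feed-forward transition model; dropout stays active at inference
    so repeated forward passes give stochastic predictions (MC dropout)."""

    def __init__(self, in_dim, hidden, out_dim, p_drop=0.1):
        self.W1 = rng.normal(0.0, 0.1, (in_dim, hidden))
        self.W2 = rng.normal(0.0, 0.1, (hidden, out_dim))
        self.p = p_drop

    def forward(self, x):
        h = np.maximum(x @ self.W1, 0.0)        # ReLU hidden layer
        mask = rng.random(h.shape) > self.p     # dropout kept ON at test time
        h = h * mask / (1.0 - self.p)           # inverted-dropout scaling
        return h @ self.W2

def mc_dropout_predict(model, x, n_samples=30):
    """Run several stochastic forward passes: the mean is the prediction,
    the per-dimension variance serves as an uncertainty estimate."""
    samples = np.stack([model.forward(x) for _ in range(n_samples)])
    return samples.mean(axis=0), samples.var(axis=0)

model = DropoutTransitionModel(in_dim=6, hidden=32, out_dim=4)
state_action = rng.normal(size=6)               # concatenated (state, action)
next_state, uncertainty = mc_dropout_predict(model, state_action)

# Gate the Dyna-style update: only use the simulated transition for a
# policy update when the model is sufficiently confident about it.
CONFIDENCE_THRESHOLD = 0.05                     # illustrative value
if uncertainty.mean() < CONFIDENCE_THRESHOLD:
    pass  # hand (state_action, next_state) to the policy-update step
```

In a full Dyna loop, the gate above would sit between the model's simulated rollouts and the agent's update step, discarding (or down-weighting) transitions whose predictive variance is high.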