Uncertainty Aware Model Integration on Reinforcement Learning

Takashi Nagata, Jinwei Xing, Tsutomu Kumazawa, E. Neftci
Published in: 2022 International Joint Conference on Neural Networks (IJCNN), 2022-07-18
DOI: 10.1109/IJCNN55064.2022.9892778 (https://doi.org/10.1109/IJCNN55064.2022.9892778)

Abstract

Model-based reinforcement learning reduces sample complexity by supplementing real experience with data generated by a learned model. Dyna is a well-known architecture of this kind: it integrates learning from interactions with the environment and learning from a model of the environment. Although a model can greatly speed up the agent's learning, acquiring an accurate model remains a hard problem despite the recent success of function approximation with neural networks. An inaccurate model degrades the agent's performance and raises a further question: to what extent should an agent rely on the model when updating its policy? In this paper, we propose incorporating the confidence of the model's simulations into the integrated learning process so that the agent avoids updating its policy based on uncertain simulations. To obtain this confidence, we apply the Monte Carlo dropout technique to the state transition model. We show that this approach improves early-stage training, helping the agent reach reasonable performance sooner. We conduct experiments on simulated robotic locomotion tasks to demonstrate the effectiveness of our approach.
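
The idea of estimating confidence with Monte Carlo dropout on a state transition model can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the confidence enters the learning process, so the variance-based uncertainty score, the fixed threshold, the network sizes, and all names (`predict_next_state`, `mc_dropout_estimate`, `filter_simulated`) are assumptions made for illustration. The model here is a tiny untrained NumPy network rather than a trained one.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes and random weights: a tiny, untrained two-layer model
# standing in for a learned state transition model.
STATE_DIM, ACTION_DIM, HIDDEN = 3, 1, 32
W1 = rng.normal(0.0, 0.5, size=(STATE_DIM + ACTION_DIM, HIDDEN))
W2 = rng.normal(0.0, 0.5, size=(HIDDEN, STATE_DIM))


def predict_next_state(state, action, drop_p=0.2):
    """One stochastic forward pass with dropout kept active at inference."""
    x = np.concatenate([state, action])
    h = np.maximum(x @ W1, 0.0)           # ReLU hidden layer
    mask = rng.random(HIDDEN) >= drop_p   # Bernoulli keep mask
    h = h * mask / (1.0 - drop_p)         # inverted dropout scaling
    return h @ W2


def mc_dropout_estimate(state, action, n_samples=50, drop_p=0.2):
    """Mean prediction and a scalar uncertainty from repeated dropout passes."""
    samples = np.stack(
        [predict_next_state(state, action, drop_p) for _ in range(n_samples)]
    )
    mean = samples.mean(axis=0)
    # Assumption: uncertainty is the per-dimension variance, averaged.
    uncertainty = float(samples.var(axis=0).mean())
    return mean, uncertainty


def filter_simulated(transitions, threshold):
    """Keep simulated transitions whose uncertainty is below the threshold."""
    kept = []
    for state, action in transitions:
        next_state, u = mc_dropout_estimate(state, action)
        if u < threshold:  # assumed rule: discard uncertain simulations
            kept.append((state, action, next_state, u))
    return kept
```

In a Dyna-style loop, a filter like `filter_simulated` would sit between the model rollout and the policy update, so that only simulations the model is relatively confident about contribute to learning. A soft weighting of the update by confidence would be an equally plausible reading of the abstract.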