{"title":"基于贝叶斯结构化探索的模型不可知元强化学习","authors":"Haonan Wang, Yiyun Zhang, Dawei Feng, Dongsheng Li, Feng Huang","doi":"10.1109/SCC49832.2020.00017","DOIUrl":null,"url":null,"abstract":"Deep reinforcement learning (RL) is playing an increasingly important role in web services such as news recommendation, vulnerability detection, and personalized services. Exploration is a key component of RL, which determines whether these RL-based applications could find effective solutions eventually. In this paper, we propose a novel gradient–based fast adaptation approach for model agnostic meta-reinforcement learning via Bayesian structure exploration (BSE-MAML). BSE-MAML could effectively learn exploration strategies from prior experience by updating policy with embedding latent space via a Bayesian mechanism. Coherent stochasticity injected by latent space are more efficient than random noise, and can produce exploration strategies to perform well in novel environment. We have conducted extensive experiments to evaluate BSE-MAML. 
Experimental results show that BSE-MAML achieves better performance in exploration in realistic environments with sparse rewards, compared to state-of-the-art meta-RL algorithms, RL methods without learning exploration strategies, and task-agnostic exploration approaches.","PeriodicalId":274909,"journal":{"name":"2020 IEEE International Conference on Services Computing (SCC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"BSE-MAML: Model Agnostic Meta-Reinforcement Learning via Bayesian Structured Exploration\",\"authors\":\"Haonan Wang, Yiyun Zhang, Dawei Feng, Dongsheng Li, Feng Huang\",\"doi\":\"10.1109/SCC49832.2020.00017\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep reinforcement learning (RL) is playing an increasingly important role in web services such as news recommendation, vulnerability detection, and personalized services. Exploration is a key component of RL, which determines whether these RL-based applications could find effective solutions eventually. In this paper, we propose a novel gradient–based fast adaptation approach for model agnostic meta-reinforcement learning via Bayesian structure exploration (BSE-MAML). BSE-MAML could effectively learn exploration strategies from prior experience by updating policy with embedding latent space via a Bayesian mechanism. Coherent stochasticity injected by latent space are more efficient than random noise, and can produce exploration strategies to perform well in novel environment. We have conducted extensive experiments to evaluate BSE-MAML. 
Experimental results show that BSE-MAML achieves better performance in exploration in realistic environments with sparse rewards, compared to state-of-the-art meta-RL algorithms, RL methods without learning exploration strategies, and task-agnostic exploration approaches.\",\"PeriodicalId\":274909,\"journal\":{\"name\":\"2020 IEEE International Conference on Services Computing (SCC)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2020 IEEE International Conference on Services Computing (SCC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SCC49832.2020.00017\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE International Conference on Services Computing (SCC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SCC49832.2020.00017","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
BSE-MAML: Model Agnostic Meta-Reinforcement Learning via Bayesian Structured Exploration
Deep reinforcement learning (RL) plays an increasingly important role in web services such as news recommendation, vulnerability detection, and personalized services. Exploration is a key component of RL and determines whether these RL-based applications can eventually find effective solutions. In this paper, we propose a novel gradient-based fast adaptation approach for model-agnostic meta-reinforcement learning via Bayesian structured exploration (BSE-MAML). BSE-MAML effectively learns exploration strategies from prior experience by updating the policy with an embedded latent space via a Bayesian mechanism. The coherent stochasticity injected through the latent space is more efficient than random noise and yields exploration strategies that perform well in novel environments. We have conducted extensive experiments to evaluate BSE-MAML. Experimental results show that BSE-MAML achieves better exploration performance in realistic environments with sparse rewards than state-of-the-art meta-RL algorithms, RL methods that do not learn exploration strategies, and task-agnostic exploration approaches.
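The abstract's central idea — that a latent sample held fixed for a whole episode injects temporally coherent stochasticity, unlike independent per-step noise — can be illustrated with a minimal sketch. This is not the paper's implementation; the linear policy, the dimensions, and the variational parameters `mu`/`log_sigma` are all hypothetical stand-ins for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

obs_dim, act_dim, latent_dim = 4, 2, 3
W = rng.normal(scale=0.1, size=(act_dim, obs_dim))   # observation weights
V = rng.normal(scale=0.1, size=(act_dim, latent_dim))  # latent-code weights

# Hypothetical variational parameters of the per-task latent distribution.
mu, log_sigma = np.zeros(latent_dim), np.zeros(latent_dim)

def policy(obs, z):
    # Toy policy conditioned on both the observation and a latent code z.
    return np.tanh(W @ obs + V @ z)

def run_episode_structured(obs, steps=50):
    # Structured exploration: draw one latent sample and keep it fixed for
    # the entire episode, so the perturbation is temporally coherent.
    z = mu + np.exp(log_sigma) * rng.normal(size=latent_dim)
    return np.array([policy(obs, z) for _ in range(steps)])

def run_episode_noise(obs, steps=50):
    # Baseline: a fresh Gaussian draw at every step (white-noise exploration).
    return np.array([policy(obs, rng.normal(size=latent_dim))
                     for _ in range(steps)])

obs = rng.normal(size=obs_dim)  # hold the observation fixed to isolate the noise source
structured = run_episode_structured(obs)
noisy = run_episode_noise(obs)

# With the observation fixed, structured actions are identical across the
# episode, while per-step noise makes them fluctuate step to step.
print(np.allclose(structured[0], structured[-1]))  # True
print(np.allclose(noisy[0], noisy[-1]))
```

The contrast is the point: under structured exploration the policy commits to one coherent behavioral perturbation per episode, which is what lets a meta-learner shape the latent distribution from prior experience rather than relying on uncorrelated noise.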