Accelerating Deep Reinforcement Learning Under the Guidance of Adaptive Fuzzy Logic Rules

Min Wang, Xingzhong Wang, Wei Luo, Yixue Huang, Yuanqiang Yu
DOI: 10.1109/phm2022-london52454.2022.00068
Published in: 2022 Prognostics and Health Management Conference (PHM-2022 London), May 2022
Citations: 0

Abstract

While Deep Reinforcement Learning (DRL) has emerged as a promising approach to many challenging tasks, training DRL agents with limited data collection and high sample efficiency remains laborious. In this paper, we present an Adaptive Fuzzy Reinforcement Learning framework (AFuRL) that accelerates the learning process by incorporating adaptive fuzzy logic rules, enabling DRL agents to explore the state space more efficiently. In AFuRL, the DRL agent first leverages prior fuzzy logic rules designed specifically for the actor-critic framework to learn near-optimal policies, then further improves these policies by automatically generating adaptive fuzzy rules from state-action pairs. Finally, the RL algorithm is applied to refine the rough policy obtained from the fuzzy controller. We demonstrate the validity of AFuRL in both discrete and continuous control tasks, where our method surpasses DRL algorithms by a substantial margin. The experimental results show that AFuRL can find superior policies in comparison with imitation-based and some prior-knowledge-based approaches.
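The abstract describes prior fuzzy logic rules guiding the agent before RL refinement takes over. As a rough illustration of that idea (not the authors' implementation — the membership functions, rule base, and blending scheme below are invented for a toy 1-D setpoint task), a Mamdani-style rule base can propose actions that are blended with the learned policy's output, with the blending weight decayed so the agent gradually takes over:

```python
def tri(x, a, b, c):
    """Triangular membership function: 0 outside [a, c], peak 1 at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzy_action(state):
    """Hypothetical rule base for a 1-D setpoint task:
    IF state is NEGATIVE THEN push right (+1);
    IF state is ZERO     THEN do nothing (0);
    IF state is POSITIVE THEN push left  (-1).
    Defuzzified as the membership-weighted average of rule outputs.
    """
    rules = [
        (tri(state, -2.0, -1.0, 0.0), +1.0),  # NEGATIVE -> +1
        (tri(state, -1.0,  0.0, 1.0),  0.0),  # ZERO     ->  0
        (tri(state,  0.0,  1.0, 2.0), -1.0),  # POSITIVE -> -1
    ]
    num = sum(w * a for w, a in rules)
    den = sum(w for w, _ in rules)
    return num / den if den > 1e-8 else 0.0

def guided_action(policy_action, state, weight):
    """Blend the learned policy's action with the fuzzy suggestion.
    `weight` would be decayed toward 0 over training so the RL policy
    eventually replaces the prior rules."""
    return weight * fuzzy_action(state) + (1.0 - weight) * policy_action
```

With `weight` near 1 early in training, exploration is steered by the prior rules toward sensible regions of the state space; as it decays, the refined RL policy dominates. The paper's adaptive-rule generation from state-action pairs would additionally rewrite the rule base itself, which this sketch does not attempt.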