Structure search of probabilistic models and data correction for EDA-RL

2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL) Pub Date: 2011-04-11 DOI: 10.1109/ADPRL.2011.5967388

H. Handa

Citations: 1

Abstract

We have proposed a novel Estimation of Distribution Algorithm (EDA) for solving reinforcement learning problems: EDA-RL. EDA-RL performs well when the structural complexity of its probabilistic model is matched to the difficulty of the given problem. This paper therefore proposes a structure search method for the probabilistic model in EDA-RL that, as in conventional EDAs, takes multivariate dependencies into account. A data correction method that eliminates loops in state transitions is also proposed. Computational simulations on maze problems containing several perceptually aliased states show the effectiveness of the proposed method.
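
The abstract does not spell out how loops of state transitions are eliminated, so the following is only a minimal sketch of one plausible reading, not the paper's procedure. It assumes an episode is recorded as a list of (state, action) pairs and that a loop is any segment that returns to a state already on the path; the function name `remove_loops` and the data layout are hypothetical. Note that under perceptual aliasing two different positions can yield the same observed state, so treating equal observations as the same state is itself an assumption.

```python
from typing import Dict, Hashable, List, Tuple

# Hypothetical episode format: (observed state, action taken in that state).
Step = Tuple[Hashable, Hashable]


def remove_loops(episode: List[Step]) -> List[Step]:
    """Return a copy of the episode in which every state appears at most once.

    When a state is revisited, the segment from its earlier visit up to the
    revisit is discarded, and the action chosen at the revisit is kept.
    """
    corrected: List[Step] = []
    position: Dict[Hashable, int] = {}  # state -> index in `corrected`

    for state, action in episode:
        if state in position:
            cut = position[state]
            # Forget every state that lies on the loop being dropped.
            for dropped_state, _ in corrected[cut:]:
                del position[dropped_state]
            del corrected[cut:]
        position[state] = len(corrected)
        corrected.append((state, action))

    return corrected


if __name__ == "__main__":
    # The agent wanders A -> B -> C -> B -> D; the detour through C is a loop.
    episode = [("A", "right"), ("B", "up"), ("C", "left"),
               ("B", "right"), ("D", "down")]
    print(remove_loops(episode))
```

In this sketch the corrected episode keeps only the last action taken in each state, so a model estimated from corrected episodes would not be fitted to the detour actions. Whether the original method keeps the first or the last action at a revisited state is not stated in the abstract.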
