Evolutionary value function approximation

M. Davarynejad, J. V. Ast, J. Vrancken, J. Berg
DOI: 10.1109/ADPRL.2011.5967349
Published in: 2011 IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL), 2011-04-11
Citations: 5

Abstract

Standard reinforcement learning algorithms have proven to be effective tools for letting an agent learn from the experience generated by its interaction with an environment. In this paper, an evolutionary approach is proposed to accelerate learning in tabular reinforcement learning algorithms. In the proposed approach, the state-values are not only approximated but also evolved using concepts from evolutionary algorithms, with the added benefit of giving each agent the opportunity to exchange its knowledge. The proposed evolutionary value function approximation moves learning from a single, isolated stage to cooperative exploration of the search space, thereby accelerating learning. The performance of the proposed algorithm is compared with the standard SARSA algorithm, and some of its properties are discussed. The experimental analysis confirms that the proposed approach converges faster, with a negligible increase in computational complexity.
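The paper's operators are not given on this page, but the idea the abstract describes — a population of tabular SARSA learners whose value tables are periodically evolved and exchanged — can be sketched. Everything below (the chain-world environment, fitness as greedy steps-to-goal, uniform crossover with the fittest agent) is an illustrative assumption, not the paper's actual algorithm.

```python
import random

random.seed(0)

N_STATES, N_ACTIONS = 5, 2   # toy chain world: action 1 = right, action 0 = left
GOAL = N_STATES - 1          # reward 1.0 on reaching the rightmost state

def step(s, a):
    s2 = min(s + 1, GOAL) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == GOAL else 0.0), s2 == GOAL

def greedy_action(Q, s):
    # break ties randomly so the untrained policy is a symmetric random walk
    best_v = max(Q[s])
    return random.choice([a for a in range(N_ACTIONS) if Q[s][a] == best_v])

def sarsa_episode(Q, alpha=0.2, gamma=0.95, eps=0.1):
    """Standard tabular SARSA update along one episode."""
    s = 0
    a = greedy_action(Q, s) if random.random() >= eps else random.randrange(N_ACTIONS)
    done = False
    while not done:
        s2, r, done = step(s, a)
        a2 = greedy_action(Q, s2) if random.random() >= eps else random.randrange(N_ACTIONS)
        target = r + (0.0 if done else gamma * Q[s2][a2])
        Q[s][a] += alpha * (target - Q[s][a])
        s, a = s2, a2

def greedy_steps(Q, cap=20):
    """Fitness proxy: steps the greedy policy needs to reach the goal (cap if never)."""
    s = 0
    for t in range(cap):
        s, _, done = step(s, greedy_action(Q, s))
        if done:
            return t + 1
    return cap

# Population of SARSA agents; after each generation the non-elite agents
# exchange knowledge by copying random rows of the fittest agent's table.
pop = [[[0.0] * N_ACTIONS for _ in range(N_STATES)] for _ in range(4)]
for generation in range(3):
    for Q in pop:
        for _ in range(30):
            sarsa_episode(Q)
    elite = min(pop, key=greedy_steps)
    for Q in pop:
        if Q is not elite:
            for s in range(N_STATES):            # uniform crossover with the elite
                if random.random() < 0.5:
                    Q[s] = elite[s][:]

best = min(pop, key=greedy_steps)
policy = [max(range(N_ACTIONS), key=lambda a: best[s][a]) for s in range(GOAL)]
print(policy)   # greedy action per non-goal state (1 = move right)
```

The exchange step is what distinguishes this from running several isolated SARSA learners: value estimates discovered by one agent propagate to the rest, which is the "cooperative exploration" the abstract refers to.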