Temporal difference learning with interpolated table value functions

2009 IEEE Symposium on Computational Intelligence and Games. Pub Date: 2009-09-07. DOI: 10.1109/CIG.2009.5286496

S. Lucas

Citations: 6

Abstract

This paper introduces a novel function approximation architecture especially well suited to temporal difference learning. The architecture is based on using sets of interpolated table look-up functions. These offer rapid and stable learning, and are efficient when the number of inputs is small. An empirical investigation is conducted to test their performance on a supervised learning task, and on the mountain car problem, a standard reinforcement learning benchmark. In each case, the interpolated table functions offer competitive performance.
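To make the idea of an interpolated table look-up function concrete, the following is a minimal sketch and not the paper's implementation. It assumes a single two-input table read by bilinear interpolation, and a TD(0) update that spreads the error over the four surrounding table entries in proportion to their interpolation weights. The class name `InterpolatedTable2D`, the helper `td0_update`, the grid size, and the state bounds in the usage note are all hypothetical choices made for illustration.

```python
import numpy as np


class InterpolatedTable2D:
    """Illustrative value function: a regular grid read by bilinear interpolation."""

    def __init__(self, low, high, n_cells):
        self.low = np.asarray(low, dtype=float)
        self.high = np.asarray(high, dtype=float)
        self.n = n_cells
        # One entry per grid node, so n_cells + 1 nodes along each input.
        self.table = np.zeros((n_cells + 1, n_cells + 1))

    def _corners(self, x):
        # Scale the input to grid coordinates in [0, n] and clip to the table.
        g = (np.asarray(x, dtype=float) - self.low) / (self.high - self.low) * self.n
        g = np.clip(g, 0.0, self.n)
        i = np.minimum(g.astype(int), self.n - 1)  # lower corner index per input
        f = g - i                                  # fractional offset in [0, 1]
        (i0, i1), (f0, f1) = i, f
        # Four surrounding nodes with bilinear weights that sum to 1.
        return [
            ((i0, i1), (1 - f0) * (1 - f1)),
            ((i0 + 1, i1), f0 * (1 - f1)),
            ((i0, i1 + 1), (1 - f0) * f1),
            ((i0 + 1, i1 + 1), f0 * f1),
        ]

    def value(self, x):
        return sum(w * self.table[idx] for idx, w in self._corners(x))

    def update(self, x, target, alpha):
        # Gradient step: each corner moves in proportion to its weight.
        error = target - self.value(x)
        for idx, w in self._corners(x):
            self.table[idx] += alpha * w * error
        return error


def td0_update(v, state, reward, next_state, done, alpha=0.1, gamma=1.0):
    """One TD(0) step: bootstrap the target from the interpolated next-state value."""
    target = reward if done else reward + gamma * v.value(next_state)
    return v.update(state, target, alpha)
```

For a mountain-car-style task, one might construct `InterpolatedTable2D([-1.2, -0.07], [0.6, 0.07], 8)` over position and velocity (bounds assumed here, not taken from the paper) and call `td0_update` after every transition. Because only four entries change per step, the cost of an update is independent of the table size, which is consistent with the abstract's remark that the approach is efficient when the number of inputs is small.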


