{"title":"A Model Based Reinforcement Learning Approach Using On-Line Clustering","authors":"Nikolaos Tziortziotis, K. Blekas","doi":"10.1109/ICTAI.2012.101","DOIUrl":null,"url":null,"abstract":"A significant issue in representing reinforcement learning agents in Markov decision processes is how to design efficient feature spaces in order to estimate optimal policy. This particular study addresses this challenge by proposing a compact framework that employs an on-line clustering approach for constructing appropriate basis functions. Also, it performs a state-action trajectory analysis to gain valuable affinity information among clusters and estimate their transition dynamics. Value function approximation is used for policy evaluation in a least-squares temporal difference framework. The proposed method is evaluated in several simulated and real environments, where we took promising results.","PeriodicalId":155588,"journal":{"name":"2012 IEEE 24th International Conference on Tools with Artificial Intelligence","volume":"34 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"7","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 IEEE 24th International Conference on Tools with Artificial Intelligence","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICTAI.2012.101","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 7
Abstract
A significant issue in representing reinforcement learning agents in Markov decision processes is how to design efficient feature spaces for estimating an optimal policy. This study addresses the challenge by proposing a compact framework that employs an on-line clustering approach to construct appropriate basis functions. In addition, it analyzes state-action trajectories to gain valuable affinity information among clusters and to estimate their transition dynamics. Value function approximation is used for policy evaluation within a least-squares temporal difference framework. The proposed method is evaluated in several simulated and real environments, yielding promising results.
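The abstract gives no implementation details, but the core pipeline it describes (on-line clustering of visited states into basis functions, followed by least-squares temporal difference policy evaluation) can be sketched roughly as below. The distance-threshold clustering rule, the Gaussian basis width, the learning-rate constant, and all identifiers are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

class OnlineClusterBasis:
    """On-line clustering of visited states; each cluster center defines a Gaussian basis.
    Sketch only: the clustering rule and bandwidth choice here are assumptions."""

    def __init__(self, threshold=1.0, bandwidth=1.0):
        self.centers = []          # cluster centers discovered so far
        self.threshold = threshold # distance beyond which a new cluster is created
        self.bandwidth = bandwidth # width of each Gaussian basis function

    def update(self, state):
        """Assign the state to the nearest cluster, or spawn a new cluster if it is too far."""
        state = np.asarray(state, dtype=float)
        if not self.centers:
            self.centers.append(state.copy())
            return
        dists = [np.linalg.norm(state - c) for c in self.centers]
        k = int(np.argmin(dists))
        if dists[k] > self.threshold:
            self.centers.append(state.copy())
        else:
            # move the winning center slightly toward the new sample (running-mean style)
            self.centers[k] += 0.1 * (state - self.centers[k])

    def features(self, state):
        """Gaussian (RBF) basis functions centered at the cluster centers."""
        state = np.asarray(state, dtype=float)
        return np.array([
            np.exp(-np.linalg.norm(state - c) ** 2 / (2 * self.bandwidth ** 2))
            for c in self.centers
        ])


def lstd_weights(transitions, basis, gamma=0.99, reg=1e-3):
    """Least-squares temporal difference (LSTD) policy evaluation:
    solve A w = b with A = sum phi(s)(phi(s) - gamma*phi(s'))^T and b = sum r*phi(s)."""
    d = len(basis.centers)
    A = reg * np.eye(d)   # small ridge term keeps the system well-conditioned
    b = np.zeros(d)
    for s, r, s_next in transitions:
        phi, phi_next = basis.features(s), basis.features(s_next)
        A += np.outer(phi, phi - gamma * phi_next)
        b += r * phi
    return np.linalg.solve(A, b)
```

In this sketch the basis is built first by feeding all visited states through `update` (so the feature dimension is fixed), and LSTD is then solved over the collected (state, reward, next-state) transitions; the approximate value of a state is the dot product of its features with the returned weights.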