Effect of human guidance and state space size on Interactive Reinforcement Learning

2011 RO-MAN | Pub Date: 2011-08-30 | DOI: 10.1109/ROMAN.2011.6005223

Halit Bener Suay, S. Chernova

Citations: 110

Abstract

The Interactive Reinforcement Learning algorithm enables a human user to train a robot by providing rewards in response to past actions and anticipatory guidance to steer the selection of future actions. Past work with software agents has shown that incorporating user guidance into the policy learning process through Interactive Reinforcement Learning significantly shortens policy learning time by reducing the number of states the agent explores. We present the first study of Interactive Reinforcement Learning in real-world robotic systems. We report on four experiments that study the effects of teacher guidance and state space size on policy learning performance. We discuss the modifications made to apply Interactive Reinforcement Learning to a real-world system and show that guidance significantly reduces learning time, and that its positive effects increase with state space size.
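The interaction described in the abstract (rewards for past actions, guidance that narrows the choice of future actions) can be illustrated with a minimal tabular Q-learning sketch. This is not the authors' implementation: the one-dimensional chain world, the scripted teacher that always suggests "right", the goal-based reward standing in for human feedback, and every name and parameter below (`train`, `select_action`, `q_update`, `alpha`, `gamma`, `epsilon`) are assumptions made purely for illustration.

```python
import random
from collections import defaultdict

ACTIONS = ("left", "right")


def select_action(q, state, guidance, rng, epsilon=0.1):
    """Epsilon-greedy choice, restricted to the teacher's suggested actions if any."""
    candidates = guidance if guidance else ACTIONS
    if rng.random() < epsilon:
        return rng.choice(candidates)
    best = max(q[(state, a)] for a in candidates)
    # Break ties randomly among the best candidates.
    return rng.choice([a for a in candidates if q[(state, a)] == best])


def q_update(q, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard one-step Q-learning update."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


def train(n_states, use_guidance, episodes=20, max_steps=200, seed=0):
    """Train on a hypothetical chain world; return the Q-table and total steps taken."""
    rng = random.Random(seed)
    q = defaultdict(float)
    goal = n_states - 1
    total_steps = 0
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            # Assumed scripted teacher: always suggests moving toward the goal.
            guidance = ("right",) if use_guidance else None
            action = select_action(q, state, guidance, rng)
            if action == "right":
                next_state = min(state + 1, goal)
            else:
                next_state = max(state - 1, 0)
            # Stand-in for the human reward signal: positive only at the goal.
            reward = 1.0 if next_state == goal else 0.0
            q_update(q, state, action, reward, next_state)
            total_steps += 1
            state = next_state
            if state == goal:
                break
    return q, total_steps


if __name__ == "__main__":
    for n_states in (6, 12):
        _, guided = train(n_states, use_guidance=True)
        _, unguided = train(n_states, use_guidance=False)
        print(n_states, "states: guided steps =", guided, "unguided steps =", unguided)
```

In this toy setting the guided agent never leaves the direct path to the goal, so comparing total step counts across chain lengths gives a rough feel for how guidance limits exploration. It says nothing about the magnitude of the effects measured on the real robot.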


