Learning to deal with objects

M. Malfaz, M. Salichs
{"title":"Learning to deal with objects","authors":"M. Malfaz, M. Salichs","doi":"10.1109/DEVLRN.2009.5175508","DOIUrl":null,"url":null,"abstract":"In this paper, a modification of the standard learning algorithm Q-learning is presented: Object Q-learning (OQ-learning). An autonomous agent should be able to decide its own goals and behaviours in order to fulfil these goals. When the agent has no previous knowledge, it must learn what to do in every state (policy of behaviour). If the agent uses Q-learning, this implies that it learns the utility value Q of each action-state pair. Typically, an autonomous agent living in a complex environment has to interact with different objects present in that world. In this case, the number of states of the agent in relation to those objects may increase as the number of objects increases, making the learning process difficult to deal with. The proposed modification appears as a solution in order to cope with this problem. The experimental results prove the usefulness of the OQ-learning in this situation, in comparison with the standard Q-learning algorithm.","PeriodicalId":192225,"journal":{"name":"2009 IEEE 8th International Conference on Development and Learning","volume":"101 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-06-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 IEEE 8th International Conference on Development and Learning","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DEVLRN.2009.5175508","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10

Abstract

In this paper, a modification of the standard Q-learning algorithm is presented: Object Q-learning (OQ-learning). An autonomous agent should be able to decide its own goals and the behaviours needed to fulfil them. When the agent has no previous knowledge, it must learn what to do in every state, i.e. a behaviour policy. If the agent uses Q-learning, this means it learns the utility value Q of each state-action pair. Typically, an autonomous agent living in a complex environment has to interact with the different objects present in that world. In this case, the number of states the agent can be in with respect to those objects may grow rapidly as the number of objects increases, making learning intractable. The proposed modification is a way to cope with this problem. The experimental results demonstrate the advantage of OQ-learning over the standard Q-learning algorithm in this situation.
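The abstract gives no pseudocode, so the following is a minimal Python sketch of the tabular Q-learning update it refers to, together with a hypothetical per-object value decomposition in the spirit of OQ-learning. The epsilon-greedy policy, the toy state and action names, and the sum used to combine per-object values are illustrative assumptions, not the paper's exact formulation.

    import random
    from collections import defaultdict

    ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1  # learning rate, discount, exploration

    def epsilon_greedy(Q, s, actions):
        # Explore with probability EPSILON, otherwise exploit current estimates.
        if random.random() < EPSILON:
            return random.choice(actions)
        return max(actions, key=lambda a: Q[(s, a)])

    def q_update(Q, s, a, r, s_next, actions):
        # Standard Q-learning backup:
        # Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
        best_next = max(Q[(s_next, a2)] for a2 in actions)
        Q[(s, a)] += ALPHA * (r + GAMMA * best_next - Q[(s, a)])

    # Hypothetical per-object decomposition in the spirit of OQ-learning:
    # one Q-table per object, each indexed by that object's sub-state only,
    # so storage grows with the number of objects rather than with the
    # product of their state spaces. Summing the per-object values is an
    # illustrative assumption, not the paper's combination rule.
    def combined_q(Q_per_object, sub_states, a):
        return sum(Q_obj[(sub_states[obj], a)]
                   for obj, Q_obj in Q_per_object.items())

    if __name__ == "__main__":
        actions = ["approach", "grasp", "ignore"]  # toy action set
        Q = defaultdict(float)
        q_update(Q, "near_obj", "grasp", 1.0, "holding_obj", actions)
        print(epsilon_greedy(Q, "near_obj", actions))

The point of the per-object decomposition is that each table covers only one object's sub-states, so the total number of learned values scales linearly with the number of objects instead of combinatorially with the joint state space.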