Representing the Reinforcement Learning state in a negotiation dialogue

2009 IEEE Workshop on Automatic Speech Recognition & Understanding. Pub Date: 2009-12-01. DOI: 10.1109/ASRU.2009.5373413

P. Heeman

Citations: 27

Abstract

Most applications of Reinforcement Learning (RL) for dialogue have focused on slot-filling tasks. In this paper, we explore a task that requires negotiation, in which conversants need to exchange information in order to decide on a good solution. We investigate what information should be included in the system's RL state so that an optimal policy can be learned and so that the state space stays reasonable in size. We propose keeping track of the decisions that the system has made, and using them to constrain the system's future behavior in the dialogue. In this way, we can compositionally represent the strategy that the system is employing. We show that this approach is able to learn a good policy for the task. This work is a first step to a more general exploration of applying RL to negotiation dialogues.
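As a rough illustration of the idea in the abstract (recording the decisions the system has made and using them to constrain its later behavior), the following minimal sketch shows one way such a state could be represented. It is not the paper's implementation: the action names, the decision labels, the `FORBIDDEN` table, and the `DialogueState` class are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical action inventory; the abstract does not list the system's actions.
ALL_ACTIONS = ("propose", "ask_preference", "inform_preference", "accept", "reject")

# Hypothetical constraint table: each recorded decision rules out some later actions.
FORBIDDEN = {
    "never_ask": {"ask_preference"},
    "commit_to_proposal": {"reject"},
}


@dataclass(frozen=True)
class DialogueState:
    """RL state that carries the decisions made so far (assumed layout)."""

    decisions: frozenset = frozenset()
    last_user_act: str = "none"

    def with_decision(self, decision: str) -> "DialogueState":
        # Return a new state; frozen dataclasses are hashable, so a state
        # can serve directly as a key in a tabular Q-function.
        return DialogueState(self.decisions | {decision}, self.last_user_act)


def allowed_actions(state: DialogueState) -> list:
    """Filter the action set using the decisions stored in the state."""
    banned = set()
    for decision in state.decisions:
        banned |= FORBIDDEN.get(decision, set())
    return [a for a in ALL_ACTIONS if a not in banned]


if __name__ == "__main__":
    start = DialogueState()
    later = start.with_decision("never_ask")
    print(allowed_actions(start))
    print(allowed_actions(later))
```

Because the state stores a set of decisions rather than the full dialogue history, the number of distinct states grows with the number of decision combinations instead of the number of possible histories, which is one way to read the abstract's goal of keeping the state space reasonable in size.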
