Dialogue Environments are Different from Games: Investigating Variants of Deep Q-Networks for Dialogue Policy

Yu-An Wang, Yun-Nung (Vivian) Chen
Published in: 2019 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), 2019-12-01
DOI: 10.1109/ASRU46091.2019.9003840 (https://doi.org/10.1109/ASRU46091.2019.9003840)
Citations: 3

Abstract

The dialogue manager is an important component of a task-oriented dialogue system. It decides the dialogue policy given the dialogue state in order to fulfill the user goal. Learning a dialogue policy is usually framed as a reinforcement learning (RL) problem, where the objective is to maximize a reward that indicates whether the conversation is successful and how efficient it is. However, although many variants of deep Q-networks (DQN) achieve better performance in game-playing scenarios, no prior work has analyzed how these improved versions perform on dialogue policy learning. Considering that dialogue interactions differ substantially from game playing, this paper investigates variants of DQN models together with different exploration strategies in a benchmark experimental setup, and then examines which RL methods are more suitable for task-completion dialogue policy learning. The code is available at https://github.com/MiuLab/Dialogue-DQN-Variants.
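To make the RL framing in the abstract concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not say which DQN variants or exploration strategies were studied, so Double DQN and epsilon-greedy are used here only as widely known examples. The function names, the reward values, and the discount factor are all hypothetical.

```python
import random


def dialogue_reward(success: bool, done: bool, max_turns: int = 20) -> float:
    """Hypothetical reward: a small cost per turn, a larger terminal bonus or penalty."""
    if not done:
        return -1.0  # every extra turn costs a little, which rewards efficiency
    # Terminal turn: the outcome signals whether the conversation succeeded.
    return 2.0 * max_turns if success else -1.0 * max_turns


def dqn_target(reward, next_q_target, done, gamma=0.9):
    """Vanilla DQN: the target network both selects and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def double_dqn_target(reward, next_q_online, next_q_target, done, gamma=0.9):
    """Double DQN: the online network selects the action, the target network evaluates it."""
    if done:
        return reward
    best = max(range(len(next_q_online)), key=next_q_online.__getitem__)
    return reward + gamma * next_q_target[best]


def epsilon_greedy(q_values, epsilon, rng=random):
    """Explore with probability epsilon, otherwise take the greedy action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)
```

The two target functions differ only in which network chooses the next action. When the online and target networks disagree about the best action, the Double DQN target is never larger than the vanilla one, which is the usual argument for its reduced overestimation.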