对说话者角色和语境的动态时间意识关注有助于口语理解

2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU) Pub Date : 2017-09-30 DOI:10.1109/ASRU.2017.8268985

Po-Chun Chen, Ta-Chung Chi, Shang-Yu Su, Yun-Nung (Vivian) Chen

{"title":"对说话者角色和语境的动态时间意识关注有助于口语理解","authors":"Po-Chun Chen, Ta-Chung Chi, Shang-Yu Su, Yun-Nung (Vivian) Chen","doi":"10.1109/ASRU.2017.8268985","DOIUrl":null,"url":null,"abstract":"Spoken language understanding (SLU) is an essential component in conversational systems. Most SLU component treats each utterance independently, and then the following components aggregate the multi-turn information in the separate phases. In order to avoid error propagation and effectively utilize contexts, prior work leveraged history for contextual SLU. However, the previous model only paid attention to the content in history utterances without considering their temporal information and speaker roles. In the dialogues, the most recent utterances should be more important than the least recent ones. Furthermore, users usually pay attention to 1) self history for reasoning and 2) others utterances for listening, the speaker of the utterances may provides informative cues to help understanding. Therefore, this paper proposes an attention-based network that additionally leverages temporal information and speaker role for better SLU, where the attention to contexts and speaker roles can be automatically learned in an end-to-end manner. The experiments on the benchmark Dialogue State Tracking Challenge 4 (DSTC4) dataset show that the time-aware dynamic role attention networks significantly improve the understanding performance1.","PeriodicalId":290868,"journal":{"name":"2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)","volume":"133 2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"26","resultStr":"{\"title\":\"Dynamic time-aware attention to speaker roles and contexts for spoken language understanding\",\"authors\":\"Po-Chun Chen, Ta-Chung Chi, Shang-Yu Su, Yun-Nung (Vivian) Chen\",\"doi\":\"10.1109/ASRU.2017.8268985\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Spoken language understanding (SLU) is an essential component in conversational systems. Most SLU component treats each utterance independently, and then the following components aggregate the multi-turn information in the separate phases. In order to avoid error propagation and effectively utilize contexts, prior work leveraged history for contextual SLU. However, the previous model only paid attention to the content in history utterances without considering their temporal information and speaker roles. In the dialogues, the most recent utterances should be more important than the least recent ones. Furthermore, users usually pay attention to 1) self history for reasoning and 2) others utterances for listening, the speaker of the utterances may provides informative cues to help understanding. Therefore, this paper proposes an attention-based network that additionally leverages temporal information and speaker role for better SLU, where the attention to contexts and speaker roles can be automatically learned in an end-to-end manner. The experiments on the benchmark Dialogue State Tracking Challenge 4 (DSTC4) dataset show that the time-aware dynamic role attention networks significantly improve the understanding performance1.\",\"PeriodicalId\":290868,\"journal\":{\"name\":\"2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)\",\"volume\":\"133 2 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-09-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"26\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ASRU.2017.8268985\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASRU.2017.8268985","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 26

摘要

口语理解(SLU)是会话系统的重要组成部分。大多数SLU分量对每个话语进行独立处理，随后的分量将多回合信息聚合在不同的相位中。为了避免错误传播并有效地利用上下文，之前的工作利用了上下文SLU的历史。然而，以往的模型只关注历史话语的内容，而没有考虑历史话语的时间信息和说话人的角色。在对话中，最近的话语应该比最近的话语更重要。此外，用户通常会注意1)自己的历史来进行推理，2)别人的话语来倾听，话语的说话者可能会提供信息线索来帮助理解。因此，本文提出了一个基于注意的网络，该网络额外利用时间信息和说话人角色来实现更好的SLU，其中对上下文和说话人角色的注意可以以端到端方式自动学习。在基准对话状态跟踪挑战4 (DSTC4)数据集上的实验表明，时间感知的动态角色关注网络显著提高了理解性能1。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Dynamic time-aware attention to speaker roles and contexts for spoken language understanding

Spoken language understanding (SLU) is an essential component in conversational systems. Most SLU component treats each utterance independently, and then the following components aggregate the multi-turn information in the separate phases. In order to avoid error propagation and effectively utilize contexts, prior work leveraged history for contextual SLU. However, the previous model only paid attention to the content in history utterances without considering their temporal information and speaker roles. In the dialogues, the most recent utterances should be more important than the least recent ones. Furthermore, users usually pay attention to 1) self history for reasoning and 2) others utterances for listening, the speaker of the utterances may provides informative cues to help understanding. Therefore, this paper proposes an attention-based network that additionally leverages temporal information and speaker role for better SLU, where the attention to contexts and speaker roles can be automatically learned in an end-to-end manner. The experiments on the benchmark Dialogue State Tracking Challenge 4 (DSTC4) dataset show that the time-aware dynamic role attention networks significantly improve the understanding performance1.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)

自引率

0.00%

发文量