Multi-attending Memory Network for Modeling Multi-turn Dialogue

Jianlong Ren, Li Yang, Chun Zuo, Weiyi Kong, Xiaoxiao Ma
{"title":"多回合对话建模的多参与记忆网络","authors":"Jianlong Ren, Li Yang, Chun Zuo, Weiyi Kong, Xiaoxiao Ma","doi":"10.1145/3341069.3342970","DOIUrl":null,"url":null,"abstract":"Modeling and reasoning about the dialogue history is a main challenge for building a good multi-turn conversational agent. End-to-end memory networks with recurrent or gated architectures have been demonstrated promising for conversation modeling. However, it still suffers from relatively low computational efficiency for its complex architectures and costly strong supervision information or fixed priori knowledge. This paper proposes a multi-head attention based end-to-end approach called multi-attending memory network without additional information or knowledge, which can effectively model and reason about multi-turn history dialogue. Specifically, a parallel multi-head attention mechanism is introduced to model conversational context via attending to different important sections of a full dialog. Thereafter, a stacked architecture with shortcut connections is presented to reason about the memory (the result of context modeling). Experiments on the bAbI-dialog datasets demonstrate the effectiveness of proposed approach.","PeriodicalId":411198,"journal":{"name":"Proceedings of the 2019 3rd High Performance Computing and Cluster Technologies Conference","volume":"36 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Multi-attending Memory Network for Modeling Multi-turn Dialogue\",\"authors\":\"Jianlong Ren, Li Yang, Chun Zuo, Weiyi Kong, Xiaoxiao Ma\",\"doi\":\"10.1145/3341069.3342970\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Modeling and reasoning about the dialogue history is a main challenge for building a good multi-turn conversational agent. End-to-end memory networks with recurrent or gated architectures have been demonstrated promising for conversation modeling. However, it still suffers from relatively low computational efficiency for its complex architectures and costly strong supervision information or fixed priori knowledge. This paper proposes a multi-head attention based end-to-end approach called multi-attending memory network without additional information or knowledge, which can effectively model and reason about multi-turn history dialogue. Specifically, a parallel multi-head attention mechanism is introduced to model conversational context via attending to different important sections of a full dialog. Thereafter, a stacked architecture with shortcut connections is presented to reason about the memory (the result of context modeling). 
Experiments on the bAbI-dialog datasets demonstrate the effectiveness of proposed approach.\",\"PeriodicalId\":411198,\"journal\":{\"name\":\"Proceedings of the 2019 3rd High Performance Computing and Cluster Technologies Conference\",\"volume\":\"36 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-06-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2019 3rd High Performance Computing and Cluster Technologies Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3341069.3342970\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2019 3rd High Performance Computing and Cluster Technologies Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3341069.3342970","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Modeling and reasoning about the dialogue history is a central challenge in building a good multi-turn conversational agent. End-to-end memory networks with recurrent or gated architectures have shown promise for conversation modeling. However, such models still suffer from relatively low computational efficiency, owing to their complex architectures, and they depend on costly strong supervision signals or fixed prior knowledge. This paper proposes a multi-head-attention-based end-to-end approach, the multi-attending memory network, which requires no additional information or knowledge and can effectively model and reason about multi-turn dialogue history. Specifically, a parallel multi-head attention mechanism is introduced to model the conversational context by attending to different important sections of the full dialogue. A stacked architecture with shortcut connections is then presented to reason over the memory (the result of context modeling). Experiments on the bAbI-dialog datasets demonstrate the effectiveness of the proposed approach.
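The abstract names two components: parallel multi-head attention that lets different heads attend to different sections of the dialogue history, and stacked reasoning hops with shortcut connections over the resulting memory. The sketch below is a minimal PyTorch illustration of that general pattern, not the authors' implementation: the embedding-based memory, the mean-pooled query, the use of nn.MultiheadAttention for the parallel heads, the residual additions standing in for shortcut connections, and all layer sizes are assumptions.

```python
import torch
import torch.nn as nn


class MultiAttendingMemoryNetwork(nn.Module):
    """Illustrative sketch only: multi-head attention over a dialogue
    memory, then stacked reasoning hops with shortcut (residual)
    connections. Hyperparameters and layer choices are assumptions,
    not details taken from the paper."""

    def __init__(self, vocab_size, embed_dim=128, num_heads=4, num_hops=3):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        # Parallel multi-head attention: each head can attend to a
        # different important section of the full dialogue history.
        self.attend = nn.MultiheadAttention(embed_dim, num_heads,
                                            batch_first=True)
        # Stacked reasoning layers over the memory, one per hop.
        self.hops = nn.ModuleList(
            [nn.Linear(embed_dim, embed_dim) for _ in range(num_hops)]
        )
        # Hypothetical output head, e.g. scoring candidate responses.
        self.out = nn.Linear(embed_dim, vocab_size)

    def forward(self, history_ids, query_ids):
        # history_ids: (batch, mem_len) token ids of the dialogue history
        # query_ids:   (batch, q_len)   token ids of the current utterance
        memory = self.embed(history_ids)                  # (batch, mem_len, d)
        query = self.embed(query_ids).mean(dim=1, keepdim=True)  # (batch, 1, d)

        # Context modeling: the query attends to the memory with
        # several heads in parallel.
        u, _ = self.attend(query, memory, memory)         # (batch, 1, d)
        u = u.squeeze(1)

        # Memory reasoning: stacked hops with shortcut connections so
        # earlier representations flow directly to later hops.
        for hop in self.hops:
            u = u + torch.relu(hop(u))                    # shortcut connection
        return self.out(u)                                # (batch, vocab_size)


# Usage on a toy batch of two dialogues.
model = MultiAttendingMemoryNetwork(vocab_size=1000)
history = torch.randint(0, 1000, (2, 50))
query = torch.randint(0, 1000, (2, 10))
logits = model(history, query)  # (2, 1000)
```

The shortcut connections keep each hop a small refinement of the previous representation, which is one plausible reading of how a stacked architecture can reason over memory without the gated machinery the abstract cites as a source of complexity.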