Jianlong Ren, Li Yang, Chun Zuo, Weiyi Kong, Xiaoxiao Ma
DOI: 10.1145/3341069.3342970
Published: 2019-06-22, in Proceedings of the 2019 3rd High Performance Computing and Cluster Technologies Conference
Multi-attending Memory Network for Modeling Multi-turn Dialogue
Modeling and reasoning about the dialogue history is a central challenge in building a good multi-turn conversational agent. End-to-end memory networks with recurrent or gated architectures have shown promise for conversation modeling. However, these models still suffer from relatively low computational efficiency, owing to their complex architectures and their reliance on costly strong supervision or fixed prior knowledge. This paper proposes an end-to-end, multi-head-attention-based approach, the multi-attending memory network, which requires no additional information or knowledge and can effectively model and reason about multi-turn dialogue history. Specifically, a parallel multi-head attention mechanism models the conversational context by attending to different important sections of the full dialogue. A stacked architecture with shortcut connections then reasons over the memory (the result of context modeling). Experiments on the bAbI-dialog datasets demonstrate the effectiveness of the proposed approach.
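The two components named in the abstract — parallel multi-head attention over the dialogue memory, and stacked reasoning hops with shortcut (residual) connections — can be sketched as below. This is a minimal illustration, not the authors' implementation: the random projection matrices stand in for learned weights, and the function names and dimensions are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attend(query, memory, num_heads=4, seed=0):
    """One multi-attending hop: each head attends to a different
    projection of the dialogue memory (one embedded vector per turn),
    and a shortcut connection adds the result back onto the query."""
    d = query.shape[-1]
    assert d % num_heads == 0
    dh = d // num_heads
    rng = np.random.default_rng(seed)
    # Hypothetical random projections stand in for learned parameters.
    Wq = rng.standard_normal((num_heads, d, dh)) / np.sqrt(d)
    Wk = rng.standard_normal((num_heads, d, dh)) / np.sqrt(d)
    Wv = rng.standard_normal((num_heads, d, dh)) / np.sqrt(d)
    heads = []
    for h in range(num_heads):
        q = query @ Wq[h]                    # (dh,)
        k = memory @ Wk[h]                   # (turns, dh)
        v = memory @ Wv[h]                   # (turns, dh)
        attn = softmax(k @ q / np.sqrt(dh))  # weights over dialogue turns
        heads.append(attn @ v)               # head-specific summary
    out = np.concatenate(heads)              # (d,) combined heads
    return query + out                       # shortcut connection

def stacked_hops(query, memory, hops=3, num_heads=4):
    """Stacked architecture: repeated attending hops reason over the
    memory, each refining the query via its shortcut connection."""
    q = query
    for i in range(hops):
        q = multi_head_attend(q, memory, num_heads=num_heads, seed=i)
    return q
```

Each head can place its attention mass on a different section of the dialogue, which is the intuition behind attending to "different important sections of a full dialog"; the shortcut connection keeps the original query signal available through multiple reasoning hops.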