{"title":"rnn在上下文表示中的作用:使用DMN-plus的案例研究","authors":"Y. Shen, E. Lai, Mahsa Mohaghegh","doi":"10.1145/3440084.3441190","DOIUrl":null,"url":null,"abstract":"Recurrent neural networks (RNNs) have been used prevalently to capture long-term dependencies of sequential inputs. In particular, for question answering systems, variants of RNNs, such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), allow the positional, ordering or contextual information to be encoded into latent contextual representations. While applying RNNs for encoding this information is intuitively reasonable, no specific research has been conducted to investigate how effective is their use in such systems when the sequence of sentences is unimportant. In this paper we conduct a case study on the effectiveness of using RNNs to generate context representations using the DMN+ network. Our results based on a three-fact task in the bAbI dataset show that sequences of facts in the training dataset influence the predictive performance of the trained system. We propose two methods to resolve this problem, one is data augmentation and the other is the optimization of the DMN+ structure by replacing the GRU in the episodic memory module with a non-recurrent operation. The experimental results demonstrate that our proposed solutions can resolve the problem effectively.","PeriodicalId":250100,"journal":{"name":"Proceedings of the 2020 4th International Symposium on Computer Science and Intelligent Control","volume":"5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-11-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"The Role of RNNs for Contextual Representations: A Case Study Using DMN-plus\",\"authors\":\"Y. Shen, E. Lai, Mahsa Mohaghegh\",\"doi\":\"10.1145/3440084.3441190\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Recurrent neural networks (RNNs) have been used prevalently to capture long-term dependencies of sequential inputs. In particular, for question answering systems, variants of RNNs, such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), allow the positional, ordering or contextual information to be encoded into latent contextual representations. While applying RNNs for encoding this information is intuitively reasonable, no specific research has been conducted to investigate how effective is their use in such systems when the sequence of sentences is unimportant. In this paper we conduct a case study on the effectiveness of using RNNs to generate context representations using the DMN+ network. Our results based on a three-fact task in the bAbI dataset show that sequences of facts in the training dataset influence the predictive performance of the trained system. We propose two methods to resolve this problem, one is data augmentation and the other is the optimization of the DMN+ structure by replacing the GRU in the episodic memory module with a non-recurrent operation. 
The experimental results demonstrate that our proposed solutions can resolve the problem effectively.\",\"PeriodicalId\":250100,\"journal\":{\"name\":\"Proceedings of the 2020 4th International Symposium on Computer Science and Intelligent Control\",\"volume\":\"5 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-11-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2020 4th International Symposium on Computer Science and Intelligent Control\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3440084.3441190\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2020 4th International Symposium on Computer Science and Intelligent Control","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3440084.3441190","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Recurrent neural networks (RNNs) have been widely used to capture long-term dependencies in sequential inputs. In particular, for question answering systems, variants of RNNs such as the Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU) allow positional, ordering, or contextual information to be encoded into latent contextual representations. While applying RNNs to encode this information is intuitively reasonable, no specific research has investigated how effective their use is in such systems when the order of the sentences is unimportant. In this paper we conduct a case study, based on the DMN+ network, of the effectiveness of using RNNs to generate context representations. Our results on a three-fact task from the bAbI dataset show that the order of the facts in the training dataset influences the predictive performance of the trained system. We propose two methods to resolve this problem: data augmentation, and a modification of the DMN+ structure that replaces the GRU in the episodic memory module with a non-recurrent operation. The experimental results demonstrate that our proposed solutions resolve the problem effectively.
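
To make the two proposed remedies concrete, the sketch below (PyTorch) contrasts a recurrent, order-sensitive episode update with an order-insensitive non-recurrent aggregation, and adds a small fact-permutation helper in the spirit of the data-augmentation idea. This is a minimal illustration only: the abstract does not specify which non-recurrent operation the authors adopt, so the EpisodicAggregator class, its dimensions, the gating scheme, and the choice of an attention-weighted sum are all assumptions, and the recurrence shown is a simplified stand-in for the DMN+ attention-based GRU rather than the paper's exact implementation.

# Minimal sketch under the assumptions stated above, not the authors' implementation.
import torch
import torch.nn as nn


class EpisodicAggregator(nn.Module):
    def __init__(self, hidden_size: int, recurrent: bool = True):
        super().__init__()
        self.recurrent = recurrent
        self.gru_cell = nn.GRUCell(hidden_size, hidden_size) if recurrent else None

    def forward(self, facts: torch.Tensor, gates: torch.Tensor) -> torch.Tensor:
        # facts: (num_facts, hidden_size) contextual representations of the facts
        # gates: (num_facts,) attention gates in [0, 1] from the attention step
        if self.recurrent:
            # Recurrent pass: the episode vector depends on the order in which
            # the facts are visited.
            h = facts.new_zeros(facts.size(1))
            for t in range(facts.size(0)):
                h_new = self.gru_cell(facts[t].unsqueeze(0), h.unsqueeze(0)).squeeze(0)
                h = gates[t] * h_new + (1.0 - gates[t]) * h
            return h
        # Non-recurrent alternative: a normalised attention-weighted sum,
        # which is invariant to the order of the facts.
        weights = gates / (gates.sum() + 1e-8)
        return (weights.unsqueeze(1) * facts).sum(dim=0)


def permute_facts(facts: torch.Tensor) -> torch.Tensor:
    # Illustrative data-augmentation idea: when the order of supporting facts
    # carries no information, training on random permutations of the facts
    # discourages the model from exploiting their order.
    return facts[torch.randperm(facts.size(0))]

A weighted sum is only one possible order-invariant replacement for the recurrence; averaging or max-pooling over the gated facts would serve the same illustrative purpose.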