The Role of RNNs for Contextual Representations: A Case Study Using DMN-plus

Y. Shen, E. Lai, Mahsa Mohaghegh
DOI: 10.1145/3440084.3441190
Published in: Proceedings of the 2020 4th International Symposium on Computer Science and Intelligent Control
Publication date: 2020-11-17
Citations: 1

Abstract

Recurrent neural networks (RNNs) have been used prevalently to capture long-term dependencies in sequential inputs. In particular, for question answering systems, variants of RNNs, such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), allow positional, ordering, or contextual information to be encoded into latent contextual representations. While applying RNNs to encode this information is intuitively reasonable, no specific research has investigated how effective their use is in such systems when the order of sentences is unimportant. In this paper, we conduct a case study, based on the DMN+ network, of the effectiveness of using RNNs to generate context representations. Our results on a three-fact task in the bAbI dataset show that the order of facts in the training dataset influences the predictive performance of the trained system. We propose two methods to resolve this problem: one is data augmentation, and the other is optimization of the DMN+ structure by replacing the GRU in the episodic memory module with a non-recurrent operation. The experimental results demonstrate that our proposed solutions resolve the problem effectively.
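The two remedies in the abstract can be sketched concretely. The following is a minimal illustration, not the authors' code: the function names, shapes, and the specific non-recurrent operation (an attention-weighted sum) are assumptions, since the paper's abstract does not specify which operation replaces the GRU. The sketch shows why a non-recurrent episode update is insensitive to fact order, and how order-based data augmentation (permuting the facts in a training example) would be performed.

```python
import numpy as np

def attention_weighted_sum(facts, gates):
    """Hypothetical non-recurrent episode update: a softmax-weighted sum
    of fact vectors. Unlike a GRU pass over the facts, the result does
    not depend on the order in which the facts are presented.

    facts: (n_facts, d) array of encoded fact vectors
    gates: (n_facts,) unnormalized attention scores for the facts
    """
    weights = np.exp(gates - gates.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights @ facts  # (d,) episode vector

def permute_facts(facts, gates, rng):
    """Data-augmentation sketch: shuffle the facts of one training
    example (keeping each fact paired with its attention score)."""
    idx = rng.permutation(len(facts))
    return facts[idx], gates[idx]

rng = np.random.default_rng(0)
facts = rng.normal(size=(3, 4))  # three facts, as in the three-fact bAbI task
gates = rng.normal(size=3)

e1 = attention_weighted_sum(facts, gates)
shuffled_facts, shuffled_gates = permute_facts(facts, gates, rng)
e2 = attention_weighted_sum(shuffled_facts, shuffled_gates)

# The episode vectors agree, so the non-recurrent update is
# order-invariant; a GRU over the same facts generally is not.
assert np.allclose(e1, e2)
```

In the recurrent DMN+ episodic memory, a GRU folds the facts in sequentially, so shuffling them changes the episode vector; the weighted-sum variant above removes that dependence by construction, which is one plausible reading of the "non-recurrent operation" the abstract describes.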