Implicit n-grams Induced by Recurrence

Xiaobing Sun, Wei Lu
{"title":"Implicit n-grams Induced by Recurrence","authors":"Xiaobing Sun, Wei Lu","doi":"10.48550/arXiv.2205.02724","DOIUrl":null,"url":null,"abstract":"Although self-attention based models such as Transformers have achieved remarkable successes on natural language processing (NLP)tasks, recent studies reveal that they have limitations on modeling sequential transformations (Hahn, 2020), which may promptre-examinations of recurrent neural networks (RNNs) that demonstrated impressive results on handling sequential data. Despite manyprior attempts to interpret RNNs, their internal mechanisms have not been fully understood, and the question on how exactly they capturesequential features remains largely unclear. In this work, we present a study that shows there actually exist some explainable componentsthat reside within the hidden states, which are reminiscent of the classical n-grams features. We evaluated such extracted explainable features from trained RNNs on downstream sentiment analysis tasks and found they could be used to model interesting linguistic phenomena such as negation and intensification. Furthermore, we examined the efficacy of using such n-gram components alone as encoders on tasks such as sentiment analysis and language modeling, revealing they could be playing important roles in contributing to the overall performance of RNNs. We hope our findings could add interpretability to RNN architectures, and also provide inspirations for proposing new architectures for sequential data.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"North American Chapter of the Association for Computational Linguistics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2205.02724","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Although self-attention based models such as Transformers have achieved remarkable successes on natural language processing (NLP) tasks, recent studies reveal that they have limitations in modeling sequential transformations (Hahn, 2020), which may prompt re-examinations of recurrent neural networks (RNNs), which have demonstrated impressive results in handling sequential data. Despite many prior attempts to interpret RNNs, their internal mechanisms have not been fully understood, and the question of how exactly they capture sequential features remains largely unclear. In this work, we present a study showing that there exist explainable components residing within the hidden states, which are reminiscent of classical n-gram features. We evaluated such explainable features extracted from trained RNNs on downstream sentiment analysis tasks and found that they could be used to model interesting linguistic phenomena such as negation and intensification. Furthermore, we examined the efficacy of using such n-gram components alone as encoders on tasks such as sentiment analysis and language modeling, revealing that they may play important roles in contributing to the overall performance of RNNs. We hope our findings add interpretability to RNN architectures and also provide inspiration for new architectures for sequential data.
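The abstract's notion of n-gram-like components residing in the hidden states can be made concrete with a small thought experiment. The sketch below is a hypothetical illustration under a simplifying assumption, not the paper's actual derivation: for a purely linear recurrence h_t = W h_{t-1} + U x_t (the weight names W, U and all sizes here are illustrative), unrolling expresses the final hidden state as an exact sum of per-token contributions, and truncating that sum to the n most recent tokens gives an explicit n-gram-style feature.

```python
import numpy as np

# A minimal sketch (hypothetical, not the paper's exact derivation): for a
# purely *linear* recurrence h_t = W h_{t-1} + U x_t, unrolling gives
#     h_T = sum_{k=0}^{T-1} W^k U x_{T-1-k},
# i.e. the hidden state decomposes exactly into per-token contributions.
# Keeping only the k < n terms yields an explicit n-gram-like feature built
# from the n most recent tokens.

rng = np.random.default_rng(0)
d_h, d_x, T, n = 8, 4, 10, 3             # hidden size, input size, sequence length, n-gram order
W = 0.3 * rng.standard_normal((d_h, d_h))
U = rng.standard_normal((d_h, d_x))
xs = rng.standard_normal((T, d_x))        # a toy embedded input sequence

# Run the recurrence as usual.
h = np.zeros(d_h)
for x in xs:
    h = W @ h + U @ x

# Exact decomposition into per-token contributions ...
contribs = [np.linalg.matrix_power(W, k) @ U @ xs[T - 1 - k] for k in range(T)]
assert np.allclose(h, sum(contribs))

# ... and the truncated "implicit n-gram" component from the last n tokens.
h_ngram = sum(contribs[:n])
print("share of the hidden state explained by the last", n, "tokens:",
      round(np.linalg.norm(h_ngram) / np.linalg.norm(h), 3))
```

When W is contractive, the truncated component accounts for most of the hidden state, which loosely mirrors the abstract's finding that n-gram-like components alone can already serve as useful encoders; trained, nonlinear RNNs require the more careful extraction studied in the paper.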