Implicit n-grams Induced by Recurrence

Xiaobing Sun, Wei Lu
{"title":"Implicit n-grams Induced by Recurrence","authors":"Xiaobing Sun, Wei Lu","doi":"10.48550/arXiv.2205.02724","DOIUrl":null,"url":null,"abstract":"Although self-attention based models such as Transformers have achieved remarkable successes on natural language processing (NLP)tasks, recent studies reveal that they have limitations on modeling sequential transformations (Hahn, 2020), which may promptre-examinations of recurrent neural networks (RNNs) that demonstrated impressive results on handling sequential data. Despite manyprior attempts to interpret RNNs, their internal mechanisms have not been fully understood, and the question on how exactly they capturesequential features remains largely unclear. In this work, we present a study that shows there actually exist some explainable componentsthat reside within the hidden states, which are reminiscent of the classical n-grams features. We evaluated such extracted explainable features from trained RNNs on downstream sentiment analysis tasks and found they could be used to model interesting linguistic phenomena such as negation and intensification. Furthermore, we examined the efficacy of using such n-gram components alone as encoders on tasks such as sentiment analysis and language modeling, revealing they could be playing important roles in contributing to the overall performance of RNNs. We hope our findings could add interpretability to RNN architectures, and also provide inspirations for proposing new architectures for sequential data.","PeriodicalId":382084,"journal":{"name":"North American Chapter of the Association for Computational Linguistics","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"North American Chapter of the Association for Computational Linguistics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.48550/arXiv.2205.02724","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Although self-attention based models such as Transformers have achieved remarkable successes on natural language processing (NLP) tasks, recent studies reveal that they have limitations in modeling sequential transformations (Hahn, 2020), which may prompt re-examinations of recurrent neural networks (RNNs), which have demonstrated impressive results in handling sequential data. Despite many prior attempts to interpret RNNs, their internal mechanisms have not been fully understood, and the question of how exactly they capture sequential features remains largely unclear. In this work, we present a study showing that there exist explainable components residing within the hidden states, which are reminiscent of classical n-gram features. We evaluated such explainable features extracted from trained RNNs on downstream sentiment analysis tasks and found that they could be used to model interesting linguistic phenomena such as negation and intensification. Furthermore, we examined the efficacy of using such n-gram components alone as encoders on tasks such as sentiment analysis and language modeling, revealing that they may play important roles in contributing to the overall performance of RNNs. We hope our findings add interpretability to RNN architectures and also provide inspiration for new architectures for sequential data.
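The abstract's notion of n-gram-like components residing in the hidden states can be made concrete with a small thought experiment. The sketch below is a hypothetical illustration under a simplifying assumption, not the paper's actual derivation: for a purely linear recurrence h_t = W h_{t-1} + U x_t (the weight names W, U and all sizes here are illustrative), unrolling expresses the final hidden state as an exact sum of per-token contributions, and truncating that sum to the n most recent tokens gives an explicit n-gram-style feature.

```python
import numpy as np

# A minimal sketch (hypothetical, not the paper's exact derivation): for a
# purely *linear* recurrence h_t = W h_{t-1} + U x_t, unrolling gives
#     h_T = sum_{k=0}^{T-1} W^k U x_{T-1-k},
# i.e. the hidden state decomposes exactly into per-token contributions.
# Keeping only the k < n terms yields an explicit n-gram-like feature built
# from the n most recent tokens.

rng = np.random.default_rng(0)
d_h, d_x, T, n = 8, 4, 10, 3             # hidden size, input size, sequence length, n-gram order
W = 0.3 * rng.standard_normal((d_h, d_h))
U = rng.standard_normal((d_h, d_x))
xs = rng.standard_normal((T, d_x))        # a toy embedded input sequence

# Run the recurrence as usual.
h = np.zeros(d_h)
for x in xs:
    h = W @ h + U @ x

# Exact decomposition into per-token contributions ...
contribs = [np.linalg.matrix_power(W, k) @ U @ xs[T - 1 - k] for k in range(T)]
assert np.allclose(h, sum(contribs))

# ... and the truncated "implicit n-gram" component from the last n tokens.
h_ngram = sum(contribs[:n])
print("share of the hidden state explained by the last", n, "tokens:",
      round(np.linalg.norm(h_ngram) / np.linalg.norm(h), 3))
```

When W is contractive, the truncated component accounts for most of the hidden state, which loosely mirrors the abstract's finding that n-gram-like components alone can already serve as useful encoders; trained, nonlinear RNNs require the more careful extraction studied in the paper.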