防止RNN使用序列长度作为特征

Proceedings of the 2022 6th International Conference on Natural Language Processing and Information Retrieval Pub Date : 2022-12-16 DOI:10.1145/3582768.3582776

Jean-Thomas Baillargeon, Hélène Cossette, Luc Lamontagne

{"title":"防止RNN使用序列长度作为特征","authors":"Jean-Thomas Baillargeon, Hélène Cossette, Luc Lamontagne","doi":"10.1145/3582768.3582776","DOIUrl":null,"url":null,"abstract":"Recurrent neural networks are deep learning topologies that can be trained to classify long documents. However, in our recent work, we found a critical problem with these cells: they can use the length differences between texts of different classes as a prominent classification feature. This has the effect of producing models that are brittle and fragile to concept drift, can provide misleading performances and are trivially explainable regardless of text content. This paper illustrates the problem using synthetic and real-world data and provides a simple solution using weight decay regularization.","PeriodicalId":315721,"journal":{"name":"Proceedings of the 2022 6th International Conference on Natural Language Processing and Information Retrieval","volume":"37 4","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Preventing RNN from Using Sequence Length as a Feature\",\"authors\":\"Jean-Thomas Baillargeon, Hélène Cossette, Luc Lamontagne\",\"doi\":\"10.1145/3582768.3582776\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Recurrent neural networks are deep learning topologies that can be trained to classify long documents. However, in our recent work, we found a critical problem with these cells: they can use the length differences between texts of different classes as a prominent classification feature. This has the effect of producing models that are brittle and fragile to concept drift, can provide misleading performances and are trivially explainable regardless of text content. This paper illustrates the problem using synthetic and real-world data and provides a simple solution using weight decay regularization.\",\"PeriodicalId\":315721,\"journal\":{\"name\":\"Proceedings of the 2022 6th International Conference on Natural Language Processing and Information Retrieval\",\"volume\":\"37 4\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2022 6th International Conference on Natural Language Processing and Information Retrieval\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3582768.3582776\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2022 6th International Conference on Natural Language Processing and Information Retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3582768.3582776","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 1

摘要

递归神经网络是一种深度学习拓扑，可以训练它对长文档进行分类。然而，在我们最近的工作中，我们发现了这些单元的一个关键问题:它们可以使用不同类别文本之间的长度差异作为一个突出的分类特征。这就产生了易碎的模型，易受概念漂移的影响，可以提供误导性的性能，并且无论文本内容如何都难以解释。本文用合成数据和实际数据说明了这个问题，并提供了一个使用权衰减正则化的简单解决方案。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Preventing RNN from Using Sequence Length as a Feature

Recurrent neural networks are deep learning topologies that can be trained to classify long documents. However, in our recent work, we found a critical problem with these cells: they can use the length differences between texts of different classes as a prominent classification feature. This has the effect of producing models that are brittle and fragile to concept drift, can provide misleading performances and are trivially explainable regardless of text content. This paper illustrates the problem using synthetic and real-world data and provides a simple solution using weight decay regularization.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 2022 6th International Conference on Natural Language Processing and Information Retrieval

自引率

0.00%

发文量