Empirical Analysis on the State of Transfer Learning for Small Data Text Classification Tasks Using Contextual Embeddings

F. Carvalho, C. Castro
{"title":"基于上下文嵌入的小数据文本分类任务迁移学习状态实证分析","authors":"F. Carvalho, C. Castro","doi":"10.21528/cbic2019-82","DOIUrl":null,"url":null,"abstract":"Recent developments in the NLP (Natural Language Processing) field have shown that deep transformer based language model architectures trained on a large corpus of unlabeled data are able to transfer knowledge to downstream tasks efficiently through fine-tuning. In particular, BERT and XLNet have shown impressive results, achieving state of the art performance in many tasks through this process. This is partially due to the ability these models have to create better representations of text in the form of contextual embeddings. However not much has been explored in the literature about the robustness of the transfer learning process of these models on a small data scenario. Also not a lot of effort has been put on analysing the behaviour of the two models fine-tuning process with different amounts of training data available. This paper addresses these questions through an empirical evaluation of these models on some datasets when finetuned on progressively smaller fractions of training data, for the task of text classification. It is shown that BERT and XLNet perform well with small data and can achieve good performance with very few labels available, in most cases. Results yielded with varying fractions of training data indicate that few examples are necessary in order to fine-tune the models and, although there is a positive effect in training with more labeled data, using only a subset of data is already enough to achieve a comparable performance with traditional non-deep learning models trained with substantially more data. Also it is noticeable how quickly the transfer learning curve of these methods saturate, reinforcing their ability to perform well with less data available. Keywords—Small data, text classification, NLP, contextual embeddings, representation learning, deep learning","PeriodicalId":160474,"journal":{"name":"Anais do 14. Congresso Brasileiro de Inteligência Computacional","volume":"69 2","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Empirical Analysis on the State of Transfer Learning for Small Data Text Classification Tasks Using Contextual Embeddings\",\"authors\":\"F. Carvalho, C. Castro\",\"doi\":\"10.21528/cbic2019-82\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Recent developments in the NLP (Natural Language Processing) field have shown that deep transformer based language model architectures trained on a large corpus of unlabeled data are able to transfer knowledge to downstream tasks efficiently through fine-tuning. In particular, BERT and XLNet have shown impressive results, achieving state of the art performance in many tasks through this process. This is partially due to the ability these models have to create better representations of text in the form of contextual embeddings. However not much has been explored in the literature about the robustness of the transfer learning process of these models on a small data scenario. Also not a lot of effort has been put on analysing the behaviour of the two models fine-tuning process with different amounts of training data available. 
This paper addresses these questions through an empirical evaluation of these models on some datasets when finetuned on progressively smaller fractions of training data, for the task of text classification. It is shown that BERT and XLNet perform well with small data and can achieve good performance with very few labels available, in most cases. Results yielded with varying fractions of training data indicate that few examples are necessary in order to fine-tune the models and, although there is a positive effect in training with more labeled data, using only a subset of data is already enough to achieve a comparable performance with traditional non-deep learning models trained with substantially more data. Also it is noticeable how quickly the transfer learning curve of these methods saturate, reinforcing their ability to perform well with less data available. Keywords—Small data, text classification, NLP, contextual embeddings, representation learning, deep learning\",\"PeriodicalId\":160474,\"journal\":{\"name\":\"Anais do 14. Congresso Brasileiro de Inteligência Computacional\",\"volume\":\"69 2\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Anais do 14. Congresso Brasileiro de Inteligência Computacional\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.21528/cbic2019-82\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Anais do 14. Congresso Brasileiro de Inteligência Computacional","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.21528/cbic2019-82","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Recent developments in the field of Natural Language Processing (NLP) have shown that deep Transformer-based language model architectures, pre-trained on large corpora of unlabeled data, can transfer knowledge to downstream tasks efficiently through fine-tuning. In particular, BERT and XLNet have shown impressive results, achieving state-of-the-art performance on many tasks through this process. This is partially due to the ability of these models to create better representations of text in the form of contextual embeddings. However, the robustness of these models' transfer learning process in small-data scenarios has received little attention in the literature, and few studies have analysed how the fine-tuning of the two models behaves as the amount of available training data varies. This paper addresses these questions through an empirical evaluation of these models on several datasets when fine-tuned on progressively smaller fractions of the training data, for the task of text classification. It is shown that BERT and XLNet perform well with small data and, in most cases, achieve good performance with very few labels available. Results obtained with varying fractions of training data indicate that few examples are necessary to fine-tune the models and that, although training with more labeled data has a positive effect, a subset of the data is already enough to match the performance of traditional non-deep-learning models trained with substantially more data. It is also noticeable how quickly the transfer learning curves of these methods saturate, reinforcing their ability to perform well when less data is available.

Keywords: small data, text classification, NLP, contextual embeddings, representation learning, deep learning
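To make the evaluation protocol concrete, below is a minimal sketch (not the authors' code) of how one might fine-tune BERT on a progressively smaller fraction of a labeled training set for text classification, using the Hugging Face `transformers` and `datasets` libraries. The dataset (`ag_news`), the fraction, and all hyperparameters are illustrative assumptions rather than the paper's exact setup.

```python
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

FRACTION = 0.1  # fine-tune on only 10% of the labeled training data (assumed value)

# Assumed dataset: AG News (4 classes); any dataset with "text"/"label" columns works.
raw = load_dataset("ag_news")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=128)

# Keep only a random fraction of the training split, mimicking the
# progressively-smaller-training-set setting described in the abstract.
train = raw["train"].shuffle(seed=42)
train = train.select(range(int(len(train) * FRACTION)))
train = train.map(tokenize, batched=True)
test = raw["test"].map(tokenize, batched=True)

model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=4)

args = TrainingArguments(
    output_dir="bert-small-data",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=2e-5,
)

Trainer(model=model, args=args, train_dataset=train, eval_dataset=test).train()
```

Sweeping `FRACTION` over several values (e.g. 0.01, 0.05, 0.1, 0.5, 1.0) and recording test accuracy at each point would trace the kind of transfer learning curve whose early saturation the abstract reports.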