Task-specific pre-training improves models for paraphrase generation

O. Skurzhanskyi, O. Marchenko
{"title":"Task-specific pre-training improves models for paraphrase generation","authors":"O. Skurzhanskyi, O. Marchenko","doi":"10.1145/3582768.3582791","DOIUrl":null,"url":null,"abstract":"Paraphrase generation is a fundamental and longstanding problem in the Natural Language Processing field. With the huge success of transfer learning, the pre-train → fine-tune approach has become a standard choice. At the same time, popular task-agnostic pre-trainings usually require gigabyte datasets and hundreds of GPUs, while available pre-trained models are limited by fixed architecture and size (i.e. base, large). We propose a simple and efficient pre-training approach specifically for paraphrase generation, which noticeably boosts model quality and matches the performance of general-purpose pre-trained models. We also investigate how this procedure influences the scores across different architectures and show that it works for all of them.","PeriodicalId":315721,"journal":{"name":"Proceedings of the 2022 6th International Conference on Natural Language Processing and Information Retrieval","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2022 6th International Conference on Natural Language Processing and Information Retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3582768.3582791","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Paraphrase generation is a fundamental and longstanding problem in Natural Language Processing. With the huge success of transfer learning, the pre-train → fine-tune approach has become a standard choice. At the same time, popular task-agnostic pre-training methods usually require gigabyte-scale datasets and hundreds of GPUs, while publicly available pre-trained models are limited to fixed architectures and sizes (e.g., base, large). We propose a simple and efficient pre-training approach designed specifically for paraphrase generation, which noticeably boosts model quality and matches the performance of general-purpose pre-trained models. We also investigate how this procedure influences scores across different architectures and show that it works for all of them.
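
The abstract does not spell out the pre-training procedure itself, so the sketch below only illustrates the general pre-train → fine-tune recipe it refers to, applied to paraphrase generation as a sequence-to-sequence task. It assumes a generic Hugging Face seq2seq model and two hypothetical tab-separated files of paraphrase pairs (`pretrain_pairs.tsv`, `finetune_pairs.tsv`); it is not the authors' implementation, and the choice of pre-training corpus and objective is an assumption.

```python
# Minimal sketch of the pre-train -> fine-tune recipe for paraphrase generation.
# Assumptions (not from the paper): a Hugging Face seq2seq model and two
# hypothetical TSV files of (source, target) paraphrase pairs.
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "t5-small"  # placeholder; the paper compares several architectures
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)


def load_pairs(path):
    """Read tab-separated (source, target) sentence pairs."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n").split("\t") for line in f if "\t" in line]


def run_epoch(pairs, lr):
    """One pass over the pairs with standard seq2seq cross-entropy loss."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
    loader = DataLoader(pairs, batch_size=8, shuffle=True, collate_fn=list)
    model.train()
    for batch in loader:
        sources, targets = zip(*batch)
        inputs = tokenizer(list(sources), padding=True, truncation=True,
                           return_tensors="pt")
        labels = tokenizer(list(targets), padding=True, truncation=True,
                           return_tensors="pt").input_ids
        labels[labels == tokenizer.pad_token_id] = -100  # ignore padding in the loss
        loss = model(**inputs, labels=labels).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()


# Stage 1: task-specific pre-training on a large (possibly noisy) paraphrase corpus.
run_epoch(load_pairs("pretrain_pairs.tsv"), lr=1e-4)

# Stage 2: fine-tuning on the smaller target paraphrase dataset.
run_epoch(load_pairs("finetune_pairs.tsv"), lr=5e-5)

model.save_pretrained("paraphrase-model")
```

The point of the two stages is that stage 1 replaces (or supplements) general-purpose pre-training with supervision drawn from the paraphrase task itself, so the same loop is reused with a different corpus and learning rate rather than a different objective.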