Task-specific pre-training improves models for paraphrase generation

O. Skurzhanskyi, O. Marchenko
{"title":"特定任务的预训练改进了释义生成模型","authors":"O. Skurzhanskyi, O. Marchenko","doi":"10.1145/3582768.3582791","DOIUrl":null,"url":null,"abstract":"Paraphrase generation is a fundamental and longstanding problem in the Natural Language Processing field. With the huge success of transfer learning, the pre-train → fine-tune approach has become a standard choice. At the same time, popular task-agnostic pre-trainings usually require gigabyte datasets and hundreds of GPUs, while available pre-trained models are limited by fixed architecture and size (i.e. base, large). We propose a simple and efficient pre-training approach specifically for paraphrase generation, which noticeably boosts model quality and matches the performance of general-purpose pre-trained models. We also investigate how this procedure influences the scores across different architectures and show that it works for all of them.","PeriodicalId":315721,"journal":{"name":"Proceedings of the 2022 6th International Conference on Natural Language Processing and Information Retrieval","volume":"2 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Task-specific pre-training improves models for paraphrase generation\",\"authors\":\"O. Skurzhanskyi, O. Marchenko\",\"doi\":\"10.1145/3582768.3582791\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Paraphrase generation is a fundamental and longstanding problem in the Natural Language Processing field. With the huge success of transfer learning, the pre-train → fine-tune approach has become a standard choice. At the same time, popular task-agnostic pre-trainings usually require gigabyte datasets and hundreds of GPUs, while available pre-trained models are limited by fixed architecture and size (i.e. base, large). We propose a simple and efficient pre-training approach specifically for paraphrase generation, which noticeably boosts model quality and matches the performance of general-purpose pre-trained models. We also investigate how this procedure influences the scores across different architectures and show that it works for all of them.\",\"PeriodicalId\":315721,\"journal\":{\"name\":\"Proceedings of the 2022 6th International Conference on Natural Language Processing and Information Retrieval\",\"volume\":\"2 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-12-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2022 6th International Conference on Natural Language Processing and Information Retrieval\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3582768.3582791\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2022 6th International Conference on Natural Language Processing and Information Retrieval","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3582768.3582791","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

Paraphrase generation is a fundamental and longstanding problem in the Natural Language Processing field. With the huge success of transfer learning, the pre-train → fine-tune approach has become a standard choice. At the same time, popular task-agnostic pre-trainings usually require gigabyte datasets and hundreds of GPUs, while available pre-trained models are limited by fixed architecture and size (i.e. base, large). We propose a simple and efficient pre-training approach specifically for paraphrase generation, which noticeably boosts model quality and matches the performance of general-purpose pre-trained models. We also investigate how this procedure influences the scores across different architectures and show that it works for all of them.
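The abstract does not spell out the task-specific pre-training procedure itself, but the pre-train → fine-tune recipe it builds on is the standard one. Below is a minimal sketch of the fine-tuning stage for a paraphrase generation model using Hugging Face Transformers; the model name (facebook/bart-base), the paraphrase_pairs.jsonl file, the source/target column names, and all hyperparameters are illustrative assumptions, not the authors' configuration.

```python
# Sketch of the standard fine-tune stage for paraphrase generation.
# Everything marked "assumed" is an illustrative choice, not taken from the paper.
from transformers import (
    AutoTokenizer,
    AutoModelForSeq2SeqLM,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainingArguments,
    Seq2SeqTrainer,
)
from datasets import load_dataset

model_name = "facebook/bart-base"  # assumed general-purpose pre-trained seq2seq model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Hypothetical paraphrase corpus with "source"/"target" text fields per line.
dataset = load_dataset("json", data_files={"train": "paraphrase_pairs.jsonl"})

def preprocess(batch):
    # Encode the input sentence and its paraphrase as encoder input / decoder labels.
    enc = tokenizer(batch["source"], truncation=True, max_length=128)
    labels = tokenizer(batch["target"], truncation=True, max_length=128)
    enc["labels"] = labels["input_ids"]
    return enc

tokenized = dataset.map(preprocess, batched=True, remove_columns=["source", "target"])

args = Seq2SeqTrainingArguments(
    output_dir="paraphrase-finetune",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=3e-5,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),  # dynamic padding
    tokenizer=tokenizer,
)
trainer.train()
```

The paper's contribution, per the abstract, is a cheaper task-specific pre-training step that would precede this fine-tuning and substitute for a general-purpose pre-trained checkpoint; its details are not given here, so only the common fine-tuning stage is sketched.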