Task-specific pre-training improves models for paraphrase generation

Proceedings of the 2022 6th International Conference on Natural Language Processing and Information Retrieval. Pub Date: 2022-12-16. DOI: 10.1145/3582768.3582791

O. Skurzhanskyi, O. Marchenko


Abstract



Paraphrase generation is a fundamental and longstanding problem in natural language processing. With the huge success of transfer learning, the pre-train → fine-tune approach has become a standard choice. At the same time, popular task-agnostic pre-training procedures usually require gigabyte-scale datasets and hundreds of GPUs, while the available pre-trained models are limited to fixed architectures and sizes (e.g., base, large). We propose a simple and efficient pre-training approach specifically for paraphrase generation, which noticeably boosts model quality and matches the performance of general-purpose pre-trained models. We also investigate how this procedure influences the scores across different architectures and show that it works for all of them.

