A Comparative Study of Transformer-based Models for Text Summarization of News Articles

International Journal of Advanced Trends in Computer Science and Engineering. Pub Date: 2024-04-10. DOI: 10.30534/ijatcse/2024/011322024


Citations: 0


A Comparative Study of Transformer-based Models for Text Summarization of News Articles

Transformer-based models such as GPT, T5, BART, and PEGASUS have made substantial progress in text summarization, a subfield of natural language processing that entails extracting the important information from lengthy texts. This study compares these four models on the summarization of news articles. Each model, pre-trained on large datasets, was fine-tuned on the CNN/DailyMail dataset with a low learning rate to preserve its learned representations. T5 achieved the highest scores, 35.12, 22.75, 32.82, and 28.59 on ROUGE-1, ROUGE-2, ROUGE-L, and ROUGE-Lsum respectively, outperforming GPT, BART, and PEGASUS on every ROUGE metric. These results indicate that encoder-decoder models such as T5 are well suited to summary generation. They also suggest that fine-tuning a pre-trained model is more effective when the pre-training objective closely aligns with the downstream task.
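To make the reported metrics concrete, the following is a minimal sketch of how ROUGE-N and ROUGE-L F1 scores can be computed for a single candidate/reference pair. It is a simplified illustration, not the evaluation code used in the study: the function names (`rouge_n`, `rouge_l`, and the helpers) are hypothetical, tokenization is plain lowercase whitespace splitting, and no stemming is applied. ROUGE-Lsum, which applies the longest-common-subsequence idea sentence by sentence, is not covered. Values returned here lie between 0 and 1; scores such as 35.12 are commonly such values scaled by 100, although the abstract does not state which variant was reported.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, candidate_total, reference_total):
    """Turn an overlap count into an F1 score; 0.0 when nothing overlaps."""
    if overlap == 0:
        return 0.0
    precision = overlap / candidate_total
    recall = overlap / reference_total
    return 2 * precision * recall / (precision + recall)


def rouge_n(candidate, reference, n=1):
    """ROUGE-N F1 based on clipped n-gram overlap (assumed whitespace tokens)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference):
    """ROUGE-L F1 based on the longest common subsequence of tokens."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    return f1(lcs_length(cand, ref), len(cand), len(ref))


if __name__ == "__main__":
    reference = "the cat sat on the mat"
    candidate = "the cat lay on the mat"
    print(round(rouge_n(candidate, reference, 1), 4))  # 0.8333
    print(round(rouge_n(candidate, reference, 2), 4))  # 0.6
    print(round(rouge_l(candidate, reference), 4))     # 0.8333
```

In the example pair, five of the six unigrams and three of the five bigrams match, and the longest common subsequence has five tokens, which explains why ROUGE-2 is the lowest of the three scores. The same ordering appears in the reported T5 results, where ROUGE-2 (22.75) is well below ROUGE-1 (35.12).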
