Abstractive Text Summarization using Pre-Trained Language Model "Text-to-Text Transfer Transformer (T5)"

Qurrota A’yuna Itsnaini, Mardhiya Hayaty, Andriyan Dwi Putra, N. Jabari
{"title":"Abstractive Text Summarization using Pre-Trained Language Model \"Text-to-Text Transfer Transformer (T5)\"","authors":"Qurrota A’yuna Itsnaini, Mardhiya Hayaty, Andriyan Dwi Putra, N. Jabari","doi":"10.33096/ilkom.v15i1.1532.124-131","DOIUrl":null,"url":null,"abstract":"Automatic Text Summarization (ATS) is one of the utilizations of technological sophistication in terms of text processing assisting humans in producing a summary or key points of a document in large quantities. We use Indonesian language as objects because there are few resources in NLP research using Indonesian language. This paper utilized PLTMs (Pre-Trained Language Models) from the transformer architecture, namely T5 (Text-to-Text Transfer Transformer) which has been completed previously with a larger dataset. Evaluation in this study was measured through comparison of the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) calculation results between the reference summary and the model summary. The experiments with the pre-trained t5-base model with fine tuning parameters of 220M for the Indonesian news dataset yielded relatively high ROUGE values, namely ROUGE-1 = 0.68, ROUGE-2 = 0.61, and ROUGE-L = 0.65. The evaluation value worked well, but the resulting model has not achieved satisfactory results because in terms of abstraction, the model did not work optimally. We also found several errors in the reference summary in the dataset used.","PeriodicalId":33690,"journal":{"name":"Ilkom Jurnal Ilmiah","volume":" ","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2023-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Ilkom Jurnal Ilmiah","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.33096/ilkom.v15i1.1532.124-131","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Automatic Text Summarization (ATS) applies text-processing technology to help humans produce summaries or key points of large numbers of documents. We chose Indonesian as the object language because few NLP research resources exist for it. This paper uses a Pre-Trained Language Model (PLM) based on the Transformer architecture, namely T5 (Text-to-Text Transfer Transformer), which had already been pre-trained on a larger dataset. Evaluation in this study compares ROUGE (Recall-Oriented Understudy for Gisting Evaluation) scores between the reference summaries and the model summaries. Fine-tuning the pre-trained t5-base model (220M parameters) on an Indonesian news dataset yielded relatively high ROUGE values: ROUGE-1 = 0.68, ROUGE-2 = 0.61, and ROUGE-L = 0.65. Although the evaluation scores are good, the resulting model is not yet satisfactory: in terms of abstraction, it did not perform optimally. We also found several errors in the reference summaries of the dataset used.
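
The pipeline the abstract describes, generating summaries with T5 and scoring them against reference summaries with ROUGE, can be sketched as follows. This is a minimal illustration rather than the paper's actual code: it loads the generic t5-base checkpoint via Hugging Face `transformers` and computes ROUGE-1/2/L with the `rouge_score` package; the Indonesian news dataset, the fine-tuned weights, and the decoding settings used in the paper are not given here, so the placeholder texts and generation parameters below are assumptions.

```python
# Minimal sketch of the summarize-then-evaluate pipeline described in the
# abstract. Assumptions: the generic t5-base checkpoint (not the paper's
# fine-tuned weights), placeholder texts, and illustrative decoding settings.
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer
from rouge_score import rouge_scorer

tokenizer = T5Tokenizer.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")

article = "..."            # stand-in for an Indonesian news article
reference_summary = "..."  # stand-in for the dataset's reference summary

# T5 casts every task as text-to-text; summarization uses a task prefix.
inputs = tokenizer("summarize: " + article,
                   return_tensors="pt", max_length=512, truncation=True)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_length=128,
                                num_beams=4, early_stopping=True)
model_summary = tokenizer.decode(output_ids[0], skip_special_tokens=True)

# ROUGE-1/2/L F1 between the reference summary and the model summary,
# matching the three scores reported in the abstract.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"],
                                  use_stemmer=False)
for name, score in scorer.score(reference_summary, model_summary).items():
    print(f"{name}: F1 = {score.fmeasure:.2f}")
```

In the paper's setting, the t5-base weights would first be fine-tuned on article/summary pairs from the Indonesian news dataset before generation; the scoring step is unchanged.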