{"title":"More than Extracting \"Important\" Sentences: the Application of PEGASUS","authors":"Ting-Hao Yang, Ching-Ching Lu, Wen-Lian Hsu","doi":"10.1109/taai54685.2021.00032","DOIUrl":null,"url":null,"abstract":"Pre-trained language models may reduce the amount of training data required. Among the models, PEGASUS, a recently proposed self-supervised approach, is trained to generate the pseudo-summary given the partially masked document. PEGASUS uses gap sentence generation for summarization. The most important sentences are masked, and then PEGASUS predicts the masked sentences as the output summary. In this study, however, we apply PEGASUS in a novel downstream task. We reformulate the task to generate the masked question part in a primary math word problem. In the past research, PEGASUS has shown good potentials on the few-shot datasets, so we try a smaller set of primary math text problems as well. The fine-tuning dataset sizes used in this study are 1000, 500, 50, 10 samples. Their performance are measured by a non-weighted average of the ROUGE-1, ROUGE-2, and ROUGE-L scores. The results show the outstanding performance of PEGASUS applied in our novel downstream task.","PeriodicalId":343821,"journal":{"name":"2021 International Conference on Technologies and Applications of Artificial Intelligence (TAAI)","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 International Conference on Technologies and Applications of Artificial Intelligence (TAAI)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/taai54685.2021.00032","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Pre-trained language models may reduce the amount of training data required. Among these models, PEGASUS, a recently proposed self-supervised approach, is trained to generate a pseudo-summary given a partially masked document. PEGASUS uses gap sentence generation for summarization: the most important sentences are masked, and PEGASUS then predicts the masked sentences as the output summary. In this study, however, we apply PEGASUS to a novel downstream task. We reformulate the task to generate the masked question part of a primary math word problem. In past research, PEGASUS has shown good potential on few-shot datasets, so we also try smaller sets of primary math word problems. The fine-tuning dataset sizes used in this study are 1000, 500, 50, and 10 samples. Performance is measured by a non-weighted average of the ROUGE-1, ROUGE-2, and ROUGE-L scores. The results show the outstanding performance of PEGASUS on our novel downstream task.
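
The sketch below is not the authors' code; it only illustrates, under stated assumptions, the kind of setup the abstract describes: a PEGASUS model generating the masked question part of a math word problem, scored by a non-weighted average of ROUGE-1, ROUGE-2, and ROUGE-L. The checkpoint name `pegasus-math-qgen`, the example problem, and the mask token are hypothetical placeholders.

```python
# Minimal sketch of masked-question generation + averaged ROUGE scoring.
# Assumes the Hugging Face `transformers` and Google `rouge_score` packages.
from transformers import PegasusForConditionalGeneration, PegasusTokenizer
from rouge_score import rouge_scorer

model_name = "pegasus-math-qgen"  # hypothetical fine-tuned checkpoint
tokenizer = PegasusTokenizer.from_pretrained(model_name)
model = PegasusForConditionalGeneration.from_pretrained(model_name)

# Math word problem with its question part masked out (placeholder example).
masked_problem = "Tom has 3 apples and buys 5 more. <mask_1>"
reference_question = "How many apples does Tom have now?"

# Generate the missing question part from the masked problem text.
inputs = tokenizer(masked_problem, return_tensors="pt", truncation=True)
output_ids = model.generate(**inputs, max_length=32, num_beams=4)
prediction = tokenizer.decode(output_ids[0], skip_special_tokens=True)

# Non-weighted average of the three ROUGE F1 scores, as described in the abstract.
scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
scores = scorer.score(reference_question, prediction)
avg_rouge = sum(s.fmeasure for s in scores.values()) / len(scores)
print(f"prediction: {prediction}\naverage ROUGE: {avg_rouge:.3f}")
```

In an actual fine-tuning run, the same averaged ROUGE would be computed over every held-out problem and reported per training-set size (1000, 500, 50, and 10 samples).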