Automatic Title Generation for Learning Resources and Pathways with Pre-trained Transformer Models

Prakhar Mishra, Chaitali Diwan, S. Srinivasa, G. Srinivasaraghavan
{"title":"Automatic Title Generation for Learning Resources and Pathways with Pre-trained Transformer Models","authors":"Prakhar Mishra, Chaitali Diwan, S. Srinivasa, G. Srinivasaraghavan","doi":"10.1142/s1793351x21400134","DOIUrl":null,"url":null,"abstract":"To create curiosity and interest for a topic in online learning is a challenging task. A good preview that outlines the contents of a learning pathway could help learners know the topic and get interested in it. Towards this end, we propose a hierarchical title generation approach to generate semantically relevant titles for the learning resources in a learning pathway and a title for the pathway itself. Our approach to Automatic Title Generation for a given text is based on pre-trained Transformer Language Model GPT-2. A pool of candidate titles are generated and an appropriate title is selected among them which is then refined or de-noised to get the final title. The model is trained on research paper abstracts from arXiv and evaluated on three different test sets. We show that it generates semantically and syntactically relevant titles as reflected in ROUGE, BLEU scores and human evaluations. We propose an optional abstractive Summarizer module based on pre-trained Transformer model T5 to shorten medium length documents. This module is also trained and evaluated on research papers from arXiv dataset. Finally, we show that the proposed model of hierarchical title generation for learning pathways has promising results.","PeriodicalId":217956,"journal":{"name":"Int. J. Semantic Comput.","volume":"15 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Int. J. Semantic Comput.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1142/s1793351x21400134","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

Creating curiosity and interest in a topic is a challenging task in online learning. A good preview that outlines the contents of a learning pathway can help learners understand the topic and become interested in it. Towards this end, we propose a hierarchical title generation approach that generates semantically relevant titles for the learning resources in a learning pathway and a title for the pathway itself. Our approach to automatic title generation for a given text is based on the pre-trained Transformer language model GPT-2. A pool of candidate titles is generated, an appropriate title is selected from among them, and the selection is then refined or de-noised to produce the final title. The model is trained on research paper abstracts from arXiv and evaluated on three different test sets. We show that it generates semantically and syntactically relevant titles, as reflected in ROUGE and BLEU scores and in human evaluations. We also propose an optional abstractive Summarizer module based on the pre-trained Transformer model T5 to shorten medium-length documents; this module is likewise trained and evaluated on research papers from the arXiv dataset. Finally, we show that the proposed hierarchical title generation model for learning pathways yields promising results.
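The abstract outlines a generate-select-refine pipeline: a fine-tuned GPT-2 samples a pool of candidate titles conditioned on the source text, one candidate is selected, and the selection is de-noised into the final title. Below is a minimal sketch of that flow using the Hugging Face transformers library; the base checkpoint, the "<sep>" delimiter, the sampling settings, and the unigram-overlap selection heuristic are illustrative assumptions, not the authors' published configuration.

```python
# Sketch of the generate -> select pipeline described in the abstract.
# The checkpoint name, <sep> delimiter, and overlap-based ranking below
# are assumptions for illustration, not the paper's exact setup.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")  # swap in a fine-tuned checkpoint
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def generate_candidate_titles(abstract: str, n_candidates: int = 5) -> list[str]:
    """Sample a pool of candidate titles conditioned on the abstract."""
    prompt = abstract.strip() + " <sep> "  # assumed training-time delimiter
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        outputs = model.generate(
            **inputs,
            do_sample=True,            # sampling yields a diverse candidate pool
            top_k=50,
            top_p=0.95,
            max_new_tokens=24,
            num_return_sequences=n_candidates,
            pad_token_id=tokenizer.eos_token_id,
        )
    prompt_len = inputs["input_ids"].shape[1]
    return [
        tokenizer.decode(seq[prompt_len:], skip_special_tokens=True).strip()
        for seq in outputs
    ]

def select_title(abstract: str, candidates: list[str]) -> str:
    """Pick the candidate whose words overlap the abstract most
    (a stand-in for the paper's selection step)."""
    source = set(abstract.lower().split())
    def overlap(title: str) -> float:
        words = title.lower().split()
        return sum(w in source for w in words) / max(len(words), 1)
    return max(candidates, key=overlap)

abstract_text = "We propose a hierarchical title generation approach ..."
pool = generate_candidate_titles(abstract_text)
print(select_title(abstract_text, pool))
```

In the paper's full pipeline, the optional T5-based Summarizer would sit upstream of this step, shortening medium-length documents before they are fed to the title generator.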