Single Document Summarization Using BertSum and Pointer Generator Network

Q2 Engineering
Rini Wijayanti, M. L. Khodra, D. H. Widyantoro
Journal: International Journal on Electrical Engineering and Informatics
Published: 2021-12-30
DOI: 10.15676/ijeei.2021.13.4.10 (https://doi.org/10.15676/ijeei.2021.13.4.10)
Citations: 2

Abstract

The rapid growth of textual data calls for automated text summarization systems that produce shortened versions of documents quickly and accurately. This paper investigates the performance of BertSum and the Pointer Generator Network (PGN) on the IndoSum corpus of Indonesian news articles. We compare these methods with NeuralSum, which has been reported to outperform other methods on the IndoSum dataset. In our experiments, BertSum with an Indonesian pre-trained model outperformed NeuralSum in extractive summarization. NeuralSum, by contrast, tends to select the leading sentences as the summary and occasionally produces an empty summary. PGN, meanwhile, effectively prevents word repetition through its coverage mechanism, although its summaries are sometimes out of context.
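To make the coverage idea mentioned in the abstract concrete, here is a minimal sketch of the coverage penalty as it is usually formulated for pointer-generator models: a coverage vector accumulates the attention given to each source token in earlier decoding steps, and each step is penalized by the overlap between its attention and that accumulated coverage. The function name `coverage_loss` and the toy attention values are illustrative assumptions; this is not the authors' implementation, and the abstract does not specify their configuration.

```python
from typing import List


def coverage_loss(attention_steps: List[List[float]]) -> float:
    """Sum over decoder steps of sum_i min(a_t[i], c_t[i]).

    a_t is the attention distribution at step t, and c_t is the running sum
    of the attention distributions from all earlier steps (the coverage).
    Hypothetical helper for illustration only.
    """
    if not attention_steps:
        return 0.0
    coverage = [0.0] * len(attention_steps[0])
    total = 0.0
    for attn in attention_steps:
        # Penalize attention mass placed on tokens that are already covered.
        total += sum(min(a, c) for a, c in zip(attn, coverage))
        # Update coverage with the attention of the current step.
        coverage = [c + a for a, c in zip(attn, coverage)]
    return total


# Attending to the same source token twice is penalized ...
repeated = coverage_loss([[1.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
# ... while moving attention to a new token is not.
spread = coverage_loss([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
print(repeated, spread)
```

In training, a term of this kind is typically added to the main loss with a weighting factor, which discourages the decoder from attending to the same source positions repeatedly and so from repeating words.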
Source journal
CiteScore: 2.70
Self-citation rate: 0.00%
Articles published: 31
Review time: 20 weeks
About the journal: International Journal on Electrical Engineering and Informatics is a peer-reviewed journal in the field of electrical engineering and informatics. It is published quarterly by the School of Electrical Engineering and Informatics, Institut Teknologi Bandung, Indonesia. All papers are blind reviewed. Accepted papers are available online (free access) and in print. There is no publication fee. The journal publishes original papers in electrical engineering and informatics, covering but not limited to the following areas:
Power Engineering: Electric Power Generation, Transmission and Distribution, Power Electronics, Power Quality, Power Economics, FACTS, Renewable Energy, Electric Traction, Electromagnetic Compatibility, Electrical Engineering Materials, High Voltage Insulation Technologies, High Voltage Apparatuses, Lightning Detection and Protection, Power System Analysis, SCADA, Electrical Measurements.
Telecommunication Engineering: Antenna and Wave Propagation, Modulation and Signal Processing for Telecommunication, Wireless and Mobile Communications, Information Theory and Coding, Communication Electronics and Microwave, Radar Imaging, Distributed Platform, Communication Network and Systems, Telematics Services, Security Network, Radio Communication.
Computer Engineering: Computer Architecture, Parallel and Distributed Computer, Pervasive Computing, Computer Network, Embedded System, Human-Computer Interaction, Virtual/Augmented Reality, Computer Security, VLSI Design, Network Traffic Modeling, Performance Modeling, Dependable Computing, High Performance Computing.