Extractive Text Summarization System for News Texts

Fahrettin Horasan, Burhan Bilen
Journal: International Journal of Applied Mathematics Electronics and Computers
Published: 2020-12-31
DOI: 10.18100/ijamec.800905 (https://doi.org/10.18100/ijamec.800905)

Abstract

The sheer volume of data available today makes it difficult to obtain information quickly and efficiently. The internet holds a wide variety of text documents, and a good extraction algorithm is essential for retrieving the most relevant information from them. Long texts can be tedious to read, so readers often want only the main idea or the useful details, which is why automatic summarization systems matter. Text summarization systems are either abstractive or extractive. Abstractive systems produce a summary from newly generated sentences, whereas extractive systems select sentences from the source text and combine them into a summary. The success of a summarization algorithm grows in direct proportion to how successfully text mining techniques are applied. A text summarization system scores the words and sentences of the source text using various methods and then combines the highest-ranked sentences into the summary presented to the user; many scoring methods have been used for this purpose. Our study uses news data sets. The algorithm is extractive and was evaluated with a task-independent method. The two highest scores obtained in the evaluation are 0.68 for ROUGE-1 and 0.54 for ROUGE-S. Precision, Recall and F-Measure values are also reported for every evaluation step so that each step can be followed clearly. This is an open access article under the CC BY-SA 4.0 license. (https://creativecommons.org/licenses/by-sa/4.0/)
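
The pipeline described in the abstract (score words, score sentences, keep the highest-ranked sentences, then evaluate with ROUGE) can be illustrated with a minimal Python sketch. The abstract does not specify which scoring methods the authors used, so the word-frequency scoring below is an assumption chosen for simplicity, and the names tokenize, split_sentences, summarize and rouge_1 are hypothetical helpers rather than the authors' code. The ROUGE-1 function follows the usual unigram-overlap definition with clipped counts and returns Precision, Recall and F-Measure.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, n_sentences=2):
    """Extractive summary: keep the n highest-scoring sentences in original order.

    Assumption: a sentence's score is the mean document frequency of its words.
    The paper may use different or additional scoring methods.
    """
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))

    def score(sentence):
        tokens = tokenize(sentence)
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:n_sentences])  # restore source order
    return " ".join(sentences[i] for i in chosen)


def rouge_1(candidate, reference):
    """Return (precision, recall, f_measure) based on clipped unigram overlap."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    overlap = sum((cand & ref).values())  # per-word minimum of the two counts
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    article = "Cats chase mice. Cats like cats. Dogs bark."
    summary = summarize(article, n_sentences=1)
    print(summary)
    print(rouge_1(summary, "Cats like other cats."))
```

A realistic system would also remove stop words, apply stemming, and combine several sentence features; ROUGE-S would additionally require counting skip-bigrams rather than unigrams.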