Automated Text Summarization and Topic Detection on News Aggregation System Using BART and SVM

Farrel Octavianus, Albert Wihardi, Muhamad Keenan Ario, Derwin Suhartono
Published in: 2022 International Symposium on Information Technology and Digital Innovation (ISITDI)
Publication date: 2022-07-27
DOI: 10.1109/ISITDI55734.2022.9944521 (https://doi.org/10.1109/ISITDI55734.2022.9944521)
Citations: 0

Abstract

Automated Text Summarization and Topic Detection on News Aggregation System Using BART and SVM
With so much news available to the public, no reader can digest all of it. This paper develops an automated text summarization and topic detection algorithm for news articles, so that readers can follow summarized news without losing its essential points. The algorithm is then used to build a news aggregation system. The system first scrapes news articles from various sources, then applies topic detection and text summarization to each article, and finally displays the results. The methodology is divided into four stages: data gathering, topic detection, text summarization, and system development. The results show that the Support Vector Machine (SVM) performed exceptionally well in topic detection, outperforming the other supervised learning algorithms evaluated in this research, while Bidirectional and Auto-Regressive Transformers (BART), with appropriate parameters, performed relatively well in text summarization. In conclusion, topic detection and automated text summarization can be combined to develop a news aggregation system, with SVM and BART each performing well in its respective task.
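
The flow described in the abstract (gather articles, detect a topic, summarize, display) can be sketched as a small pipeline with pluggable stages. This is only an illustration of the structure, not the authors' implementation: the names `Article`, `ProcessedArticle`, `keyword_topic`, `lead_summary`, and `aggregate` are hypothetical, and the two stage functions are deliberately trivial stand-ins. In the system the abstract describes, the topic detector would be a trained SVM classifier and the summarizer a BART model; here a keyword counter and a first-sentence extractor take their places so the sketch runs with the standard library alone.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Article:
    """A scraped news article (hypothetical schema)."""
    source: str
    title: str
    body: str


@dataclass
class ProcessedArticle:
    """An article after topic detection and summarization."""
    source: str
    title: str
    topic: str
    summary: str


def keyword_topic(text: str) -> str:
    """Placeholder topic detector (NOT an SVM): counts made-up keywords."""
    keywords = {
        "sports": ("match", "team", "league"),
        "business": ("market", "shares", "profit"),
    }
    words = re.findall(r"[a-z]+", text.lower())
    scores = {
        topic: sum(words.count(k) for k in keys)
        for topic, keys in keywords.items()
    }
    best = max(scores, key=scores.get)
    # Fall back to a catch-all label when no keyword matched.
    return best if scores[best] > 0 else "other"


def lead_summary(text: str, max_sentences: int = 1) -> str:
    """Placeholder summarizer (NOT BART): keeps the leading sentence(s)."""
    sentences = [s.strip() for s in text.split(".") if s.strip()]
    if not sentences:
        return ""
    return ". ".join(sentences[:max_sentences]) + "."


def aggregate(
    articles: Iterable[Article],
    detect_topic: Callable[[str], str],
    summarize: Callable[[str], str],
) -> List[ProcessedArticle]:
    """Apply topic detection and summarization to every article."""
    return [
        ProcessedArticle(a.source, a.title, detect_topic(a.body), summarize(a.body))
        for a in articles
    ]


if __name__ == "__main__":
    scraped = [
        Article(
            source="example-source",
            title="Cup final",
            body="The team won the match. Fans celebrated all night.",
        )
    ]
    for item in aggregate(scraped, keyword_topic, lead_summary):
        print(f"[{item.topic}] {item.title}: {item.summary}")
```

Because `aggregate` only depends on two callables, a trained classifier or a neural summarizer could be swapped in without changing the rest of the pipeline.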