Hot Topic Detection on Newspaper

T. Cao, Tat-Huy Tran, Thanh-Thuy Luu
Published in: Proceedings of the 9th International Symposium on Information and Communication Technology
Publication date: 2018-12-06
DOI: 10.1145/3287921.3287965 (https://doi.org/10.1145/3287921.3287965)
Citation count: 2

Abstract

Online newspapers are gradually replacing traditional print newspapers, and the variety of articles they publish motivates the need to capture hot topics, giving Internet users a shortcut to the hot news. A hot topic reflects people's real-life concerns and has a large impact not only on the community but also on business. In this paper, we propose a novel topic detection approach that applies Hierarchical Density-Based Spatial Clustering of Applications with Noise (HDBSCAN) to a Vector Space Model (VSM) to address the challenge of noisy data, and applies the Pearson product-moment correlation coefficient (PMCC) to high-ranking keywords to identify the topics behind those keywords. The proposed approach is evaluated on a dataset of ten thousand articles, and the experimental results are competitive in terms of precision with other state-of-the-art methods.
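The abstract names two building blocks, a vector space model and the Pearson correlation between keywords, without giving implementation details. The following is a minimal sketch of those two pieces only, not the authors' method. It assumes TF-IDF weighting for the VSM and assumes that each keyword is described by a series of per-day frequency counts. The function names, the greedy grouping rule, the 0.8 threshold, and the sample data are all hypothetical. The HDBSCAN clustering step is omitted.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Map tokenized documents to sparse TF-IDF dicts (a basic vector space model)."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return vectors


def pearson(x, y):
    """Pearson product-moment correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # assumption: treat a constant series as uncorrelated
    return cov / (sx * sy)


def group_keywords(series, threshold=0.8):
    """Greedy grouping (hypothetical rule): a keyword joins the first group
    whose first member it correlates with at or above the threshold."""
    groups = []
    for keyword, counts in series.items():
        for group in groups:
            if pearson(series[group[0]], counts) >= threshold:
                group.append(keyword)
                break
        else:
            groups.append([keyword])
    return groups


# Hypothetical per-day frequency counts for three high-ranking keywords.
daily_counts = {
    "election": [1, 5, 9, 4],
    "vote": [2, 6, 10, 5],
    "storm": [9, 4, 1, 6],
}
topics = group_keywords(daily_counts)
# topics == [["election", "vote"], ["storm"]]
```

In this sketch, keywords whose frequencies rise and fall together end up in the same group, which is one plausible reading of "identifying topics behind keywords". A fuller pipeline would first cluster the TF-IDF document vectors and rank keywords within each cluster.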