Probabilistic topic model based approach for detecting bursty events from social media data

Chunshan Li, Dianhui Chu
Published in: 2017 International Conference on Security, Pattern Analysis, and Cybernetics (SPAC), 2017-12-01
DOI: 10.1109/SPAC.2017.8304365 (https://doi.org/10.1109/SPAC.2017.8304365)
Citations: 1

Abstract

Detecting bursty events in the huge volume of real-time data generated by social networks has attracted growing research interest. Most existing algorithms detect bursty events by discovering either co-occurring bursty words or emerging topics, and they ignore the association between burstiness and topics. These algorithms also cope poorly with short texts such as Weibo and Twitter posts. This paper proposes two novel probabilistic generative models, TBE and TBEP. The TBE model detects bursty events in long articles by simultaneously considering the co-occurrence relationships among bursty words and the co-occurrence relationships between the words that occur and the underlying topics that generate the bursty events. The TBEP model encodes the assumption that each post has exactly one topic, which lets it handle bursty events on Weibo and Twitter. Gibbs sampling is used to estimate the model parameters. Extensive experiments on three real data sets, compared against state-of-the-art Hot-Bursty-Event detection algorithms, show that the proposed approach (1) achieves better model performance with respect to the evaluation criteria and (2) detects bursty events more accurately on both long and short text data.
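
The one-topic-per-post assumption and the use of Gibbs sampling can be illustrated with a much simpler model. The sketch below is not the TBE or TBEP model (the abstract does not spell out their generative processes); it is a collapsed Gibbs sampler for a plain mixture of unigrams with Dirichlet priors, in which every post is assigned exactly one topic. The function name `sample_post_topics`, the hyperparameters `alpha` and `beta`, and the toy posts are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def sample_post_topics(posts, num_topics, alpha=0.1, beta=0.1, iters=30, seed=0):
    """Collapsed Gibbs sampling for a mixture of unigrams (one topic per post).

    Illustrative stand-in only; this is NOT the paper's TBEP model.
    """
    rng = random.Random(seed)
    vocab_size = len({w for post in posts for w in post})
    posts_per_topic = [0] * num_topics                     # posts assigned to topic k
    tokens_per_topic = [0] * num_topics                    # tokens assigned to topic k
    word_counts = [Counter() for _ in range(num_topics)]   # per-topic word counts

    def move(post, k, sign):
        """Add (sign=+1) or remove (sign=-1) a post from topic k's counts."""
        posts_per_topic[k] += sign
        tokens_per_topic[k] += sign * len(post)
        for w in post:
            word_counts[k][w] += sign

    # Random initial assignment of each post to a single topic.
    assignment = []
    for post in posts:
        k = rng.randrange(num_topics)
        assignment.append(k)
        move(post, k, +1)

    for _ in range(iters):
        for d, post in enumerate(posts):
            move(post, assignment[d], -1)  # exclude the post being resampled
            log_weights = []
            for k in range(num_topics):
                # Prior term: topics that already hold many posts are favored.
                lw = math.log(posts_per_topic[k] + alpha)
                # Likelihood term: Dirichlet-multinomial over the post's words.
                for w, c in Counter(post).items():
                    for j in range(c):
                        lw += math.log(word_counts[k][w] + beta + j)
                for i in range(len(post)):
                    lw -= math.log(tokens_per_topic[k] + vocab_size * beta + i)
                log_weights.append(lw)
            # Normalize in log space before sampling to avoid underflow.
            top = max(log_weights)
            weights = [math.exp(lw - top) for lw in log_weights]
            assignment[d] = rng.choices(range(num_topics), weights=weights)[0]
            move(post, assignment[d], +1)
    return assignment


if __name__ == "__main__":
    toy_posts = [
        ["quake", "rescue", "quake"],
        ["rescue", "quake"],
        ["match", "goal"],
        ["goal", "match", "score"],
    ]
    print(sample_post_topics(toy_posts, num_topics=2))
```

Because a whole post moves between topics at once, short posts pool their few words into a single topic's counts rather than splitting them across several topics, which is the intuition behind treating short texts this way. A burst-aware model would add further variables on top of this skeleton; those are not attempted here.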