Event detection from Twitter data

Jagrati Singh, Ishneet Kaur, Anil Kumar Singh
2019 4th International Conference on Information Systems and Computer Networks (ISCON), published 2019-11-01
DOI: 10.1109/ISCON47742.2019.9036286 (https://doi.org/10.1109/ISCON47742.2019.9036286)
Citations: 3

Abstract

Event detection from Twitter helps people extract valuable information about real-world events. Automating this task is challenging because microblogging data is short and noisy. Topic modeling algorithms such as Latent Dirichlet Allocation (LDA) are the most popular way to extract topics from news articles, but they are not suitable for microblogging content because of the data sparsity problem. In this paper, we propose a method that handles the data sparsity problem and makes the LDA topic model suitable for Twitter data by treating a super tweet (an aggregation of similar tweets), rather than a single tweet, as a document, without modifying the internal structure of the model. Extensive experiments on real-time Twitter data show that our approach outperforms the baseline approaches.
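The super tweet idea can be illustrated with a minimal Python sketch. The abstract does not say how similar tweets are identified, so everything below is an assumption made for illustration: the Jaccard similarity over token sets, the greedy single-pass grouping, the 0.3 threshold, and the hypothetical helper names `tokenize`, `jaccard`, and `build_super_tweets`. It is not the authors' procedure. The sketch only covers the aggregation step; the resulting strings would then be passed as documents to an unmodified LDA implementation.

```python
import re


def tokenize(text):
    """Lowercase a tweet and return its set of alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 if either is empty)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def build_super_tweets(tweets, threshold=0.3):
    """Greedily group similar tweets and join each group into one document.

    Assumption: a tweet joins the first group whose accumulated vocabulary
    has Jaccard similarity >= threshold; otherwise it starts a new group.
    """
    groups = []  # each group holds its token set and its member tweets
    for tweet in tweets:
        tokens = tokenize(tweet)
        for group in groups:
            if jaccard(tokens, group["tokens"]) >= threshold:
                group["tweets"].append(tweet)
                group["tokens"] |= tokens
                break
        else:
            # No sufficiently similar group was found.
            groups.append({"tokens": set(tokens), "tweets": [tweet]})
    # Each super tweet is the concatenation of its member tweets.
    return [" ".join(g["tweets"]) for g in groups]


tweets = [
    "Earthquake hits the city center",
    "Strong earthquake hits city this morning",
    "New phone launch event today",
    "Phone launch event draws crowds today",
]
super_tweets = build_super_tweets(tweets)
for doc in super_tweets:
    print(doc)
```

Because each super tweet is longer than any single tweet, word co-occurrence statistics per document are denser, which is the sparsity issue the abstract describes. The topic model itself is left untouched; only its input documents change.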