Event detection from Twitter data
Jagrati Singh, Ishneet Kaur, Anil Kumar Singh
2019 4th International Conference on Information Systems and Computer Networks (ISCON), published 2019-11-01
DOI: 10.1109/ISCON47742.2019.9036286
Citations: 3
Abstract
Event detection from Twitter helps people extract valuable information about real-world events. Automating this task is challenging because microblogging data is short and noisy. Latent Dirichlet Allocation (LDA) is the most popular topic modeling algorithm for extracting topics from news articles, but it is not suitable for microblogging content because of the data sparsity problem. In this paper, we propose a method that handles the data sparsity problem and makes the LDA topic model suitable for Twitter data: a super tweet (an aggregation of similar tweets), rather than a single tweet, is treated as a document, and the internal structure of the model is left unmodified. Extensive experiments on real-time Twitter data show that our approach outperforms the baseline approaches.
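The aggregation step described in the abstract can be illustrated with a minimal sketch. The abstract does not say how similar tweets are identified, so everything specific here is an assumption made for illustration: the Jaccard word-overlap measure, the greedy single-pass grouping, the `threshold` value, and the function names `tokenize`, `jaccard`, and `build_super_tweets` are all hypothetical and are not taken from the paper.

```python
import re


def tokenize(text):
    """Lowercase a tweet and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two token sets (assumed measure, not from the paper)."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def build_super_tweets(tweets, threshold=0.3):
    """Greedily merge each tweet into the most similar existing group.

    A tweet joins the group whose accumulated vocabulary it overlaps most,
    provided the similarity reaches `threshold` (an assumed value);
    otherwise it starts a new group. Each group is returned as one
    concatenated string, i.e. one "super tweet" document.
    """
    groups = []  # each group holds its vocabulary and its member tweets
    for tweet in tweets:
        tokens = tokenize(tweet)
        best, best_sim = None, 0.0
        for group in groups:
            sim = jaccard(tokens, group["vocab"])
            if sim > best_sim:
                best, best_sim = group, sim
        if best is not None and best_sim >= threshold:
            best["tweets"].append(tweet)
            best["vocab"] |= tokens
        else:
            groups.append({"vocab": set(tokens), "tweets": [tweet]})
    return [" ".join(group["tweets"]) for group in groups]
```

Each returned string is longer than any single tweet, which is the point of the aggregation: the documents handed to the topic model carry more word co-occurrence evidence. Because the model itself is untouched, the super tweets could in principle be passed to any stock LDA implementation in place of the raw tweets.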