Events Insights Extraction from Twitter Using LDA and Day-Hashtag Pooling
Muhammad Haseeb U. R. Rehman Khan, Kei Wakabayashi, Satoshi Fukuyama
Proceedings of the 21st International Conference on Information Integration and Web-based Applications & Services, published 2019-12-02
DOI: 10.1145/3366030.3366090 (https://doi.org/10.1145/3366030.3366090)
Citations: 2
Abstract
News extraction from Twitter data is a hot topic. But can we extract much more than just news? The purpose of this research is to determine whether news is the only information that can be extracted from Twitter data, or whether the data contains further insights about real-life events. To that end, we introduce a technique for analyzing Twitter's raw content. After pre-processing the tweet data, we apply hashtag pooling and extract topics using an existing topic modeling algorithm, Latent Dirichlet Allocation (LDA), without modifying its core machinery. In the second part, the estimated number of tweets per day and the correlated top hashtags for each topic are calculated using day-hashtag pooling. Finally, a continuous time series graph is constructed for topic analysis. Our findings show interesting results for bursty news detection, topic popularity, how people perceive an event, the transition of real-life events over time, and the before-and-after effects of a specific event.
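
The two pooling steps named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `hashtag_pool` and `day_hashtag_pool`, the `(day, text)` input format, the hashtag regular expression, and the sample tweets are all assumptions made for illustration. The sketch only shows how tweets might be grouped into pseudo-documents per hashtag (the kind of input an off-the-shelf LDA implementation could be trained on) and per (day, hashtag) pair (from which daily counts can be read). The abstract's per-topic daily estimates would additionally require a fitted LDA model, which is omitted here.

```python
import re
from collections import defaultdict

# Assumed hashtag pattern: '#' followed by word characters.
HASHTAG_RE = re.compile(r"#(\w+)")


def hashtag_pool(tweets):
    """Merge tweet texts into one pseudo-document per hashtag.

    Tweets without any hashtag are dropped (an assumption of this sketch).
    """
    pools = defaultdict(list)
    for _day, text in tweets:
        # set() avoids adding a tweet twice if it repeats a hashtag.
        for tag in set(HASHTAG_RE.findall(text.lower())):
            pools[tag].append(text)
    return {tag: " ".join(texts) for tag, texts in pools.items()}


def day_hashtag_pool(tweets):
    """Group tweet texts by (day, hashtag) pair."""
    pools = defaultdict(list)
    for day, text in tweets:
        for tag in set(HASHTAG_RE.findall(text.lower())):
            pools[(day, tag)].append(text)
    return dict(pools)


# Hypothetical sample data: (day, tweet text) pairs.
tweets = [
    ("2019-12-01", "Heavy rain expected #weather #tokyo"),
    ("2019-12-01", "Trains delayed by rain #tokyo"),
    ("2019-12-02", "Sunny again #weather"),
]

# One pseudo-document per hashtag, suitable as LDA training input.
topic_docs = hashtag_pool(tweets)

# Number of tweets per (day, hashtag) pair, the raw material for a time series.
daily = day_hashtag_pool(tweets)
tweets_per_day_tag = {key: len(texts) for key, texts in daily.items()}
```

Pooling by hashtag gives the topic model longer documents than individual tweets, while keeping the day in the pooling key preserves the time axis needed for a time series graph.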