Using Latent Dirichlet Allocation for Topic Modelling in Twitter
D. Ostrowski
Proceedings of the 2015 IEEE 9th International Conference on Semantic Computing (IEEE ICSC 2015), published 2015-03-02
DOI: https://doi.org/10.1109/ICOSC.2015.7050858
Because of its predictive nature, social media has proved to be an important resource for identifying trends. In Customer Relationship Management, the need goes beyond trend identification to understanding the topics propagated through social networks. In this paper, we explore topic modeling using Latent Dirichlet Allocation, a generative probabilistic model for collections of discrete data. We evaluate the technique both for classification and for identifying noteworthy topics, applying it to a filtered collection of Twitter messages. Experiments show that these methods are effective for identifying sub-topics and for supporting classification within large-scale corpora.
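
To make "generative probabilistic model" concrete, the following is a minimal sketch of the standard LDA generative process for a single short document, using only the Python standard library. It is not the paper's implementation and performs no inference. The toy topics in `TOPICS`, the prior `alpha`, and the helper names `sample_dirichlet` and `generate_document` are all hypothetical, chosen only for illustration.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def generate_document(topic_word, alpha, length, rng):
    """Generate one document under the LDA generative process.

    topic_word: one dict per topic, mapping word -> probability.
    alpha: Dirichlet prior over topic proportions, one entry per topic.
    """
    # Step 1: draw the per-document topic mixture theta ~ Dirichlet(alpha).
    theta = sample_dirichlet(alpha, rng)
    words = []
    for _ in range(length):
        # Step 2: draw a topic index z for this word position from theta.
        z = rng.choices(range(len(theta)), weights=theta)[0]
        # Step 3: draw the word from the chosen topic's word distribution.
        vocab = list(topic_word[z].keys())
        probs = list(topic_word[z].values())
        words.append(rng.choices(vocab, weights=probs)[0])
    return theta, words


# Hypothetical toy topics; these are NOT taken from the paper's Twitter data.
TOPICS = [
    {"engine": 0.5, "mileage": 0.3, "dealer": 0.2},
    {"game": 0.6, "score": 0.3, "team": 0.1},
]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
theta, tweet = generate_document(TOPICS, alpha=[0.5, 0.5], length=8, rng=rng)
```

Fitting LDA to real messages runs this process in reverse: given only the observed words, it estimates the topic-word distributions and each document's mixture `theta`. A document could then be assigned to its highest-weight topic, which is one plausible reading of the classification use the abstract describes.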