A hybrid Latent Dirichlet Allocation approach for topic classification
Chi-I Hsu, C. Chiu
2017 IEEE International Conference on INnovations in Intelligent SysTems and Applications (INISTA), 2017-07-03
DOI: 10.1109/INISTA.2017.8001177
Citations: 11
Abstract
Many classification techniques can automatically summarize text into topics and identify the corresponding topic terms in online reviews. Of these, Latent Dirichlet Allocation (LDA) and Latent Semantic Analysis (LSA) are two of the most frequently used. LDA is a probabilistic generative model that projects a document into a topic space using the Dirichlet distribution, where each topic is a probability distribution over words. Because the topics LDA extracts are often implicit, this study first applies LDA in a supervised way to examine the topics of online reviews of game apps. To improve the topic classification performance of LDA, the study then proposes a hybrid approach that uses a Genetic Algorithm (GA) to discover optimal weights for the LDA topics.
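
The abstract does not say how the GA-derived weights enter the classifier, so the following is only a minimal sketch under stated assumptions, not the authors' method. It assumes that document-topic proportions from an LDA fit are already available (the toy `thetas` matrix is invented for illustration), that each topic corresponds to one class label, and that a document is assigned to the label with the largest weighted topic proportion. The function names, the GA operators (elitist selection, one-point crossover, Gaussian mutation), and all parameter values are hypothetical choices.

```python
import random

def predict(theta, weights):
    """Assign the label whose weighted topic proportion is largest (assumed rule)."""
    scores = [t * w for t, w in zip(theta, weights)]
    return scores.index(max(scores))

def accuracy(thetas, labels, weights):
    """Fraction of documents whose predicted label matches the given label."""
    hits = sum(predict(t, weights) == y for t, y in zip(thetas, labels))
    return hits / len(labels)

def evolve_weights(thetas, labels, pop_size=20, generations=30, seed=0):
    """Search for topic weights with a small elitist genetic algorithm."""
    rng = random.Random(seed)
    k = len(thetas[0])
    # Seed the population with uniform weights (the unweighted baseline)
    # plus random individuals, so the result is never worse than the baseline.
    pop = [[1.0] * k] + [[rng.uniform(0.0, 2.0) for _ in range(k)]
                         for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=lambda w: accuracy(thetas, labels, w), reverse=True)
        elite = pop[: pop_size // 2]           # keep the better half unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, k)          # one-point crossover
            child = a[:cut] + b[cut:]
            i = rng.randrange(k)               # mutate one weight, keep it >= 0
            child[i] = max(0.0, child[i] + rng.gauss(0.0, 0.3))
            children.append(child)
        pop = elite + children
    return max(pop, key=lambda w: accuracy(thetas, labels, w))

# Toy document-topic proportions (invented): topic 0 is over-represented,
# so the unweighted argmax mislabels some documents.
thetas = [
    [0.50, 0.40, 0.10],
    [0.45, 0.40, 0.15],
    [0.40, 0.15, 0.45],
    [0.42, 0.20, 0.38],
    [0.70, 0.20, 0.10],
    [0.30, 0.60, 0.10],
]
labels = [0, 1, 2, 2, 0, 1]

baseline = accuracy(thetas, labels, [1.0] * 3)
best = evolve_weights(thetas, labels)
print("baseline accuracy:", baseline)
print("weighted accuracy:", accuracy(thetas, labels, best))
```

Because the uniform weight vector is part of the initial population and the better half of each generation is carried over unchanged, the accuracy of the returned weights on the data used for the search cannot fall below the unweighted baseline. In practice the fitness would be measured on held-out reviews to avoid overfitting the weights.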