Indonesian text feature extraction using Gibbs sampling and mean variational inference Latent Dirichlet Allocation
Authors: P. Prihatini, I. Putra, I. Giriantari, M. Sudarma
Published in: 2017 15th International Conference on Quality in Research (QiR): International Symposium on Electrical and Computer Engineering
Publication date: 2017-07-24
DOI: 10.1109/QIR.2017.8168448 (https://doi.org/10.1109/QIR.2017.8168448)
Citations: 7
Abstract
Latent Dirichlet Allocation (LDA) is a topic-based method that uses inference to determine the topics of a document. Many inference methods have been used for LDA; Gibbs Sampling and Mean Variational Inference are the most widely used in research. However, few studies discuss the implementation of these methods on Indonesian text, so an analysis is needed to compare their performance in feature extraction. This paper therefore implements Gibbs Sampling and Mean Variational Inference for LDA on Indonesian text. The objective is to determine the performance of both algorithms on Indonesian text and so provide a reference for choosing the better inference method for LDA on Indonesian text. The study used digital Indonesian news text data comprising 100 documents. The features produced by the extraction process were tested using three types of evaluation metric. The test results show that Gibbs Sampling performs better than Mean Variational Inference for LDA on Indonesian text.
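To make the Gibbs Sampling side of the comparison concrete, the following is a minimal sketch of a collapsed Gibbs sampler for LDA in plain Python. It is not the authors' implementation: the function name `lda_gibbs`, the hyperparameter values, the iteration count, and the six-word Indonesian toy corpus are all assumptions chosen for illustration, and the abstract does not describe the preprocessing or settings actually used.

```python
import random


def lda_gibbs(docs, num_topics, vocab_size, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA (illustrative sketch).

    docs is a list of documents, each a list of integer word ids.
    Returns (theta, phi): document-topic and topic-word distributions.
    """
    rng = random.Random(seed)
    # Count tables: document-topic, topic-word, and per-topic token totals.
    n_dk = [[0] * num_topics for _ in docs]
    n_kw = [[0] * vocab_size for _ in range(num_topics)]
    n_k = [0] * num_topics

    # Assign every token a random initial topic and record the counts.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = z[d][i]
                n_dk[d][k] -= 1
                n_kw[k][w] -= 1
                n_k[k] -= 1
                # Full conditional p(z = t | all other assignments), unnormalized.
                weights = [
                    (n_dk[d][t] + alpha)
                    * (n_kw[t][w] + beta)
                    / (n_k[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                # Sample a new topic and add the token back to the counts.
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                n_dk[d][k] += 1
                n_kw[k][w] += 1
                n_k[k] += 1

    # Smoothed estimates from the final counts.
    theta = [
        [(n_dk[d][k] + alpha) / (len(doc) + num_topics * alpha) for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(n_kw[k][w] + beta) / (n_k[k] + vocab_size * beta) for w in range(vocab_size)]
        for k in range(num_topics)
    ]
    return theta, phi


# Hypothetical toy corpus (not the paper's data): finance and football words.
vocab = ["harga", "pasar", "saham", "gol", "tim", "pemain"]
word_id = {w: i for i, w in enumerate(vocab)}
raw_docs = [
    "harga saham pasar saham harga",
    "pasar saham harga pasar",
    "gol tim pemain gol tim",
    "pemain tim gol pemain",
]
docs = [[word_id[w] for w in text.split()] for text in raw_docs]

theta, phi = lda_gibbs(docs, num_topics=2, vocab_size=len(vocab))
for text, row in zip(raw_docs, theta):
    print(text, [round(p, 3) for p in row])
```

Each row of `theta` is a per-document topic distribution that can serve as a feature vector for that document, which is the sense of "feature extraction" in the abstract. A variational counterpart would optimize a factorized approximation to the same posterior instead of sampling from it.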