{"title":"A Method of Accelerating LDA Program with GPU","authors":"Yanjun Jiang, Hualong Wen, Zhanchun Gao","doi":"10.1109/ICNDC.2012.14","journal":{"name":"2012 Third International Conference on Networking and Distributed Computing"},"publicationDate":"2012-10-21","citationCount":"1","platform":"Semanticscholar","ListUrlMain":"https://doi.org/10.1109/ICNDC.2012.14"}
Latent Dirichlet Allocation (LDA) is a generative probabilistic model for text, widely used to discover latent topics in a collection of documents. Mahout includes an implementation of LDA; however, its execution time is very long when processing a large number of documents, because the documents are processed sequentially. This paper introduces a method for modifying this program with the CUDA toolkit provided by NVIDIA so that a group of documents can be processed in parallel on the GPU. Using this method, the LDA program can be accelerated considerably.
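
The abstract does not show the kernel itself, so the following is only a minimal sketch of why document-level parallelism is possible. It assumes the per-document step is the standard variational update for LDA (phi proportional to beta times exp(digamma(gamma)), then gamma = alpha + sum of phi), which may differ from what the paper or Mahout actually does. All names here (`digamma`, `infer_document`, `infer_corpus_parallel`, the toy `beta` and `docs`) are hypothetical and chosen for illustration. The point is that each document only reads the shared topic-word matrix, so documents can be handled independently.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def digamma(x):
    """Digamma function for x > 0 (recurrence plus asymptotic series)."""
    result = 0.0
    while x < 6.0:  # shift upward using psi(x) = psi(x + 1) - 1/x
        result -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1 / 12 - inv2 * (1 / 120 - inv2 / 252))
    return result + math.log(x) - 0.5 / x - series


def infer_document(words, beta, alpha=0.1, iterations=50):
    """Assumed variational update for one document; returns its gamma vector.

    words: list of word ids; beta[k][w]: probability of word w under topic k.
    """
    num_topics = len(beta)
    gamma = [alpha + len(words) / num_topics] * num_topics
    for _ in range(iterations):
        exp_psi = [math.exp(digamma(g)) for g in gamma]
        new_gamma = [alpha] * num_topics
        for w in words:
            # Unnormalized topic responsibilities for this word.
            phi = [beta[k][w] * exp_psi[k] for k in range(num_topics)]
            total = sum(phi)
            for k in range(num_topics):
                new_gamma[k] += phi[k] / total
        gamma = new_gamma
    return gamma


def infer_corpus_sequential(docs, beta):
    """Baseline: one document after another."""
    return [infer_document(d, beta) for d in docs]


def infer_corpus_parallel(docs, beta, workers=4):
    """Same work, dispatched per document; beta is shared and read-only."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda d: infer_document(d, beta), docs))


# Toy data (hypothetical): 2 topics over a 4-word vocabulary.
beta = [
    [0.45, 0.45, 0.05, 0.05],  # topic 0 favors words 0 and 1
    [0.05, 0.05, 0.45, 0.45],  # topic 1 favors words 2 and 3
]
docs = [[0, 1, 0, 1, 0], [2, 3, 3, 2], [0, 2, 1, 3, 0, 0]]

if __name__ == "__main__":
    for gamma in infer_corpus_parallel(docs, beta):
        print([round(g, 3) for g in gamma])
```

The thread pool here only illustrates the independence between documents; in standard CPython it is not expected to speed up this pure-Python loop. On a GPU, the same structure would map each document (or each document-topic pair) to its own thread or thread block, which is the kind of parallelism the abstract describes.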