A Method of Accelerating LDA Program with GPU

2012 Third International Conference on Networking and Distributed Computing | Pub Date: 2012-10-21 | DOI: 10.1109/ICNDC.2012.14

Yanjun Jiang, Hualong Wen, Zhanchun Gao

Citations: 1

Abstract

LDA (Latent Dirichlet Allocation) is a text modeling algorithm based on a generative probabilistic model and is widely used to discover latent topics in a set of documents. Mahout implements the LDA algorithm; however, its execution time is very long when processing a large number of documents, because the documents are processed sequentially. This paper introduces a method for modifying the Mahout LDA program with the CUDA toolkit provided by NVIDIA, so that a group of documents can be processed in parallel on the GPU. With this method, the LDA program can be accelerated greatly.
