Branch-combined PLSA for Topic Extraction
Jiali Lin, Zhiqiang Wei, Z. Li
International Journal of Database Theory and Application, volume 21 1, pages 149-162
Published 2017-01-31 (Journal Article)
DOI: 10.14257/ijdta.2017.10.1.14 (https://doi.org/10.14257/ijdta.2017.10.1.14)
Li (lizhen0130@gmail.com)
Abstract: With the development of Internet technology, the amount of information on the network is growing geometrically. Faced with such a vast volume of information, quickly extracting the important content has become an urgent need, and topic extraction models are a good solution to this problem. This paper proposes a new model based on Probabilistic Latent Semantic Analysis (PLSA), called Branch-combined PLSA (BPLSA). BPLSA divides the training data into two subsets, trains on each subset separately, and then performs global training. In addition, the Message Passing Interface (MPI) is used for parallel computing to speed up the proposed method. Through the parallelization of the BPLSA, the efficiency is
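
The training scheme described in the abstract (split the data into two subsets, train each separately, then train globally) can be illustrated with a minimal pure-Python sketch. The abstract does not say how the two branch models are combined, so the averaging step below is an assumption made only for illustration, as are the helper names `normalize`, `init_topics`, `plsa_em` and `bplsa` and the toy count matrix. It is not the authors' implementation, and it runs serially rather than with MPI.

```python
import random

def normalize(row):
    """Scale a list of positive numbers so that it sums to 1."""
    total = sum(row)
    return [x / total for x in row]

def init_topics(n_topics, n_words, rng):
    """Random initial P(w|z): one normalized row per topic."""
    return [normalize([rng.random() + 0.1 for _ in range(n_words)])
            for _ in range(n_topics)]

def plsa_em(counts, p_w_z, n_iter=30, seed=0):
    """Standard PLSA EM on a document-word count matrix, starting from p_w_z."""
    rng = random.Random(seed)
    n_docs, n_words, n_topics = len(counts), len(counts[0]), len(p_w_z)
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(n_topics)])
             for _ in range(n_docs)]
    for _ in range(n_iter):
        # Small positive floor keeps every row normalizable.
        new_w_z = [[1e-12] * n_words for _ in range(n_topics)]
        new_z_d = [[1e-12] * n_topics for _ in range(n_docs)]
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: posterior P(z|d,w) proportional to P(w|z) * P(z|d).
                post = normalize([p_w_z[z][w] * p_z_d[d][z]
                                  for z in range(n_topics)])
                # M-step: accumulate expected counts.
                for z in range(n_topics):
                    new_w_z[z][w] += n * post[z]
                    new_z_d[d][z] += n * post[z]
        p_w_z = [normalize(r) for r in new_w_z]
        p_z_d = [normalize(r) for r in new_z_d]
    return p_w_z, p_z_d

def bplsa(counts, n_topics=2, seed=0):
    """Branch training on two subsets, followed by global training."""
    n_words = len(counts[0])
    half = len(counts) // 2
    branches = [counts[:half], counts[half:]]
    # Both branches start from the same topics so that topic indices are
    # more likely (not guaranteed) to correspond across branches.
    init = init_topics(n_topics, n_words, random.Random(seed))
    branch_topics = [plsa_em(subset, init, seed=seed + i)[0]
                     for i, subset in enumerate(branches)]
    # ASSUMPTION: combine the branches by averaging P(w|z) topic by topic.
    # The source does not specify the combination rule.
    combined = [normalize([(a + b) / 2 for a, b in zip(row_a, row_b)])
                for row_a, row_b in zip(*branch_topics)]
    # Global training over all documents, initialized from the combined topics.
    return plsa_em(counts, combined, seed=seed)

# Toy corpus: 4 documents over a 4-word vocabulary (hypothetical data).
toy_counts = [[4, 3, 0, 0], [5, 2, 0, 1], [0, 0, 3, 4], [0, 1, 4, 4]]
p_w_z, p_z_d = bplsa(toy_counts, n_topics=2, seed=0)
for z, row in enumerate(p_w_z):
    print("topic", z, [round(p, 3) for p in row])
```

Because the two branch runs do not depend on each other, they are natural candidates for running on separate MPI processes. How the paper actually distributes the work is not stated in the abstract.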