Authors: Shiqi Wen, Cheng Wang, Haibo Li, Guoqi Zheng
Journal: Int. J. Inf. Technol. Manag., 15 1 (Journal Article), published 2019-05-10
DOI: 10.1504/IJITM.2019.10021200 (https://doi.org/10.1504/IJITM.2019.10021200)
Parallel naïve Bayes regression model-based collaborative filtering recommendation algorithm and its realisation on Hadoop for big data
Collaborative filtering (CF) algorithms are widely used in recommender systems. However, their space-time overhead and high computational complexity hinder their use in large-scale systems. This paper implements a parallel naive Bayes regression model-based collaborative filtering recommendation algorithm on the Hadoop computing platform to address the scalability problem of CF. First, the paper analyses the inherent parallelism of the naive Bayes regression model and constructs a theoretical model of naive Bayes parallelisation. Second, the parallel algorithm is realised on the Hadoop platform, with the Hadoop distributed file system (HDFS) and MapReduce as the transparent distributed infrastructure, and its time and space overhead and speedup are discussed. Finally, the parallel algorithm is applied to a large dataset. Experimental results on the Netflix dataset show that the method has high scalability and lower space-time overhead, making it suitable for real-time recommendation on large datasets.
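
The parallelism mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not run on Hadoop. It assumes only that naive Bayes statistics are counts, so they can be computed per data partition (a map step) and summed (a reduce step). The toy ratings, the function names `map_partition`, `reduce_counts` and `predict`, the add-one smoothing, and the use of a plain naive Bayes classifier over rating levels (rather than the paper's regression model) are all assumptions made for illustration. Partitions are assumed to be split by user, so that each user's ratings stay together.

```python
from collections import Counter
from functools import reduce

# Hypothetical toy ratings (user, item, rating); not data from the paper.
RATINGS = [
    ("u1", "A", 5), ("u1", "B", 4),
    ("u2", "A", 5), ("u2", "B", 5),
    ("u3", "A", 1), ("u3", "B", 2),
    ("u4", "A", 5), ("u4", "B", 4),
]

def map_partition(ratings):
    """Map step: local naive Bayes counts for one partition of users."""
    by_user = {}
    for user, item, rating in ratings:
        by_user.setdefault(user, {})[item] = rating
    counts = Counter()
    for items in by_user.values():
        for target, r_t in items.items():
            # How often the target item received rating r_t.
            counts[("prior", target, r_t)] += 1
            for other, r_o in items.items():
                if other != target:
                    # Co-occurrence of (target rating, other item's rating).
                    counts[("joint", target, r_t, other, r_o)] += 1
    return counts

def reduce_counts(a, b):
    """Reduce step: counts from different partitions simply add up."""
    return a + b

def predict(counts, target, evidence, levels=(1, 2, 3, 4, 5)):
    """Pick the most probable rating level, with add-one smoothing (assumed)."""
    best_level, best_score = None, -1.0
    for v in levels:
        prior = counts[("prior", target, v)]
        score = prior + 1.0
        for other, r_o in evidence.items():
            joint = counts[("joint", target, v, other, r_o)]
            score *= (joint + 1.0) / (prior + len(levels))
        if score > best_score:
            best_level, best_score = v, score
    return best_level

# Two partitions, split by user, processed independently and then merged.
partitions = [RATINGS[:4], RATINGS[4:]]
merged = reduce(reduce_counts, (map_partition(p) for p in partitions))
prediction = predict(merged, "A", {"B": 4})
```

Because the merged counts are identical to those from a single pass over all the data, the map step can in principle be spread over many workers without changing the model. This is the general property that a MapReduce formulation of naive Bayes relies on.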