Parallel naïve Bayes regression model-based collaborative filtering recommendation algorithm and its realisation on Hadoop for big data
Authors: Shiqi Wen, Cheng Wang, Haibo Li, Guoqi Zheng
Journal: Int. J. Inf. Technol. Manag.
Publication date: 2019-05-10
DOI: 10.1504/IJITM.2019.10021200 (https://doi.org/10.1504/IJITM.2019.10021200)
Citations: 3
Abstract
Collaborative filtering (CF) algorithms are widely used in recommender systems. However, their space-time overhead and high computational complexity hinder their use in large-scale systems. This paper implements a parallel naive Bayes regression model-based collaborative filtering recommendation algorithm on the Hadoop computing platform to address the scalability problem of CF. First, the paper analyses the inherent parallelism of the naive Bayes regression model and constructs a theoretical model of naive Bayes parallelisation. Second, the parallel algorithm is realised on the Hadoop platform, with the Hadoop Distributed File System (HDFS) and MapReduce as the transparent distributed infrastructure, and its time and space overhead and speedup are discussed. Finally, the parallel algorithm is applied to a large dataset. Experimental results on the Netflix dataset show that the method is highly scalable and has reduced space-time overhead, making it suitable for real-time recommendation on large datasets.
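The abstract does not spell out the model or the MapReduce job layout, so the following is only a minimal sketch of why naive Bayes training tends to parallelise well, not the authors' implementation. It assumes a generic count-based naive Bayes in which the label is a user's rating of a target item and the features are that user's ratings of other items. The names `map_shard`, `reduce_counts`, `item_a`, and `item_b` are hypothetical, and plain Python functions stand in for Hadoop mappers and reducers. The point it illustrates is that the sufficient statistics are counts, and counts computed on separate shards can simply be summed.

```python
from collections import Counter
from functools import reduce


def map_shard(shard):
    """Map step (illustrative): count labels and (feature, value, label) triples in one shard."""
    counts = Counter()
    for features, label in shard:
        counts[("class", label)] += 1
        for name, value in features.items():
            counts[("feat", name, value, label)] += 1
    return counts


def reduce_counts(partials):
    """Reduce step (illustrative): partial counts are additive, so merging is a sum."""
    return reduce(lambda a, b: a + b, partials, Counter())


# Toy data (made up for illustration): each record is
# (ratings of other items by one user, that user's rating of the target item).
records = [
    ({"item_a": 5, "item_b": 4}, 5),
    ({"item_a": 1, "item_b": 2}, 1),
    ({"item_a": 5, "item_b": 5}, 5),
    ({"item_a": 2, "item_b": 1}, 2),
]

# Split the records into two shards, as a distributed file system would.
shards = [records[:2], records[2:]]

parallel = reduce_counts(map_shard(s) for s in shards)
serial = map_shard(records)

# The sharded result matches the single-pass result.
print(parallel == serial)
```

Because each shard is processed independently and the merge is associative, the map step can in principle run on as many workers as there are shards. Turning the counts into probabilities, and into the regression model the paper describes, is left out here.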