Improvement of Dynamic Hybrid Collaborative Filtering Based on Spark
Haorui Li, Qiang Huang
2019 IEEE 5th International Conference on Computer and Communications (ICCC), pp. 8-12, 2019-12-01
DOI: https://doi.org/10.1109/ICCC47050.2019.9064416
Spark's in-memory computing framework is well suited to iterative computation, so this paper applies the ALS (alternating least squares) recommendation algorithm on the Spark platform and improves its calculation method. To account for more practical factors and obtain more accurate result sets, we first cluster the data with C-Means during preprocessing, which reduces the computation spent on redundant data and the sparsity of the rating matrix. Second, cosine similarity and Pearson similarity are applied to improve the user similarity calculation. Finally, a hybrid recommendation function is constructed. On the Spark distributed big data platform, the method is trained and compared both offline and in real time on the MovieLens data set, and the results show that it reduces computing time and improves the efficiency and accuracy of the algorithm.
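The two user-similarity measures named in the abstract can be illustrated with a small sketch. The abstract does not say how the paper combines them, so the weighted blend below and its `alpha` parameter are assumptions made for illustration only. The function names and the toy ratings are likewise hypothetical, and plain Python is used instead of Spark to keep the example self-contained.

```python
import math


def co_rated(u, v):
    """Items rated by both users (u and v map item id -> rating)."""
    return sorted(set(u) & set(v))


def cosine_similarity(u, v):
    """Cosine of the angle between the two users' co-rated rating vectors."""
    items = co_rated(u, v)
    if not items:
        return 0.0
    dot = sum(u[i] * v[i] for i in items)
    norm_u = math.sqrt(sum(u[i] ** 2 for i in items))
    norm_v = math.sqrt(sum(v[i] ** 2 for i in items))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def pearson_similarity(u, v):
    """Pearson correlation over co-rated items (ratings centred on each user's mean)."""
    items = co_rated(u, v)
    if len(items) < 2:
        return 0.0
    mean_u = sum(u[i] for i in items) / len(items)
    mean_v = sum(v[i] for i in items) / len(items)
    du = [u[i] - mean_u for i in items]
    dv = [v[i] - mean_v for i in items]
    denom = math.sqrt(sum(x * x for x in du)) * math.sqrt(sum(y * y for y in dv))
    if denom == 0:
        return 0.0
    return sum(x * y for x, y in zip(du, dv)) / denom


def blended_similarity(u, v, alpha=0.5):
    """Hypothetical blend: alpha * cosine + (1 - alpha) * Pearson (an assumption, not the paper's formula)."""
    return alpha * cosine_similarity(u, v) + (1 - alpha) * pearson_similarity(u, v)


# Toy ratings (made up for illustration): item id -> rating
alice = {1: 5, 2: 3, 3: 4}
bob = {1: 4, 2: 2, 3: 3, 4: 5}

print(cosine_similarity(alice, bob))
print(pearson_similarity(alice, bob))
print(blended_similarity(alice, bob, alpha=0.5))
```

In this toy case Bob rates every shared item exactly one point lower than Alice, so the Pearson correlation is at its maximum while the cosine value falls slightly short of 1. That difference is why the two measures are often used together: Pearson removes each user's rating offset, whereas cosine does not.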