Towards Distributed Multi-model Learning on Apache Spark for Model-Based Recommender
Anas Alzogbi, Polina Koleva, G. Lausen
2019 IEEE 35th International Conference on Data Engineering Workshops (ICDEW), 2019-04-08
DOI: https://doi.org/10.1109/ICDEW.2019.00-12
Model-based approaches to Content-based Filtering (CBF) recommendation can produce representative user models because they learn from users' actions. However, training an individual model for each user raises a scalability issue and incurs a high computational cost, which has limited the adoption of model-based approaches as efficient CBF recommenders. This is particularly relevant for production systems, where the recommender is expected to serve a large number of users. In this work, we address the efficiency issue of model-based CBF recommender systems and present a new approach for distributed multi-model learning based on Apache Spark. We use Ranking SVM as the underlying recommendation algorithm and present a distributed implementation that trains multiple models efficiently in parallel on a collection of machines. We demonstrate the efficiency of our approach on a real-world dataset from CiteULike and show that it reduces the cost of multi-model learning without affecting prediction accuracy.
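The following is a minimal sketch of the general idea of multi-model learning: one independent Ranking SVM per user, with the per-user training jobs run in parallel. It is not the implementation described in the paper. It assumes a linear model trained by stochastic subgradient descent on the pairwise hinge loss, uses a local thread pool as a stand-in for a Spark cluster, and all names (`train_ranking_svm`, `train_all_users`, `pairs_by_user`), hyperparameters, and toy data are hypothetical.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def train_ranking_svm(pairs, dim, epochs=50, lr=0.1, reg=0.01, seed=0):
    """Fit a linear Ranking SVM for one user (illustrative, not the paper's code).

    pairs: list of (preferred_features, other_features) tuples.
    Minimizes reg/2 * ||w||^2 + max(0, 1 - w . (preferred - other)) by SGD.
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    pairs = list(pairs)
    for _ in range(epochs):
        rng.shuffle(pairs)
        for pos, neg in pairs:
            diff = [p - n for p, n in zip(pos, neg)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            violated = margin < 1.0  # hinge loss is active for this pair
            for i in range(dim):
                grad = reg * w[i] - (diff[i] if violated else 0.0)
                w[i] -= lr * grad
    return w


def train_all_users(pairs_by_user, dim, workers=4):
    """Train one independent model per user; users are processed in parallel.

    The thread pool is an assumption standing in for distributed workers.
    """
    users = sorted(pairs_by_user)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        models = pool.map(
            lambda u: train_ranking_svm(pairs_by_user[u], dim), users
        )
        return dict(zip(users, models))


def score(w, x):
    """Ranking score of an item with feature vector x under model w."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical toy data: (relevant item, irrelevant item) feature pairs per user.
pairs_by_user = {
    "alice": [([1.0, 0.0], [0.0, 1.0]), ([0.9, 0.1], [0.2, 0.8])],
    "bob": [([0.0, 1.0], [1.0, 0.0]), ([0.1, 0.9], [0.7, 0.3])],
}
models = train_all_users(pairs_by_user, dim=2)
```

Because each user's model depends only on that user's preference pairs, the training jobs share no state and can be distributed freely. On Spark, one plausible mapping (again an assumption, not the paper's design) would key the training pairs by user and apply the training function to each user's group.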