Research on Collaborative Filtering Recommendation Algorithm Based on Mahout
Hui Cao, Liyang Yan
DEStech Transactions on Environment, Energy and Earth Sciences, 2020-04-01
DOI: 10.12783/dteees/peems2019/34001
This paper studies the recommendation algorithms of the Mahout machine learning platform. It analyzes the principles of the current mainstream recommendation algorithms, focusing on item-based collaborative filtering. A recommender for the Book-Crossing data set is implemented using the collaborative filtering algorithms provided by Mahout. The similarity measure and other parameters of the general recommendation algorithm are varied, and the resulting recommendations are compared and analyzed.

Introduction

With the rapid development of Internet technology and the progress of intelligent terminal devices, the mobile Internet has risen. It makes it more convenient for people to publish and share information, and at the same time it brings people a large amount of data. While we enjoy the convenience of the information age, we also produce many kinds of information ourselves. A traditional search engine suits the scenario in which users know their needs clearly and can express them as keywords. However, when users cannot express their needs or have no clear and effective search terms, the recommendation system emerges as a technology that makes up for the shortcomings of traditional search engines. It uses different recommendation algorithms to model the user's preferences and, according to the model, predicts the items or information the user may be interested in and recommends them to the user. The item-based collaborative filtering algorithm is one of the most widely used and effective recommendation algorithms. Recommendation systems have gradually become a core function of IT companies that rely on information and data, such as Taobao, Toutiao (Today's Headlines) and NetEase Cloud Music. Recommendation algorithms have developed rapidly, from collaborative filtering to latent factor (implicit semantic) models and then to deep learning models.
The goal of a recommendation system is to predict users' preferences accurately, to achieve the best recommendation effect by coordinating algorithms, system functions and user experience, and to make the consumer experience more intelligent and user-friendly. This paper analyzes the Mahout recommendation algorithms and, taking a book recommendation system as an example, compares the results of the recommendation algorithm under different parameters.

Recommendation Based on Collaborative Filtering

Collaborative filtering is one of the most mature approaches in recommendation systems. Its core idea is to use users' behavior data to extract user features and, by computing the existing user-to-item correlations, to discover new user-to-item correlations to recommend to the current user. The mainstream collaborative filtering algorithms are user-based collaborative filtering (User-Based CF) and item-based collaborative filtering (Item-Based CF). Both are introduced below.

User-Based CF: Recommend to user A the items that interest a similar user B and that user A has not yet browsed. When the system makes recommendations for user A, it uses the users' history to find a user B whose preferences are similar to A's, together with B's item set, and then recommends to A the items in B's item set that A has not purchased. The algorithm has two steps: first, find the user B whose preferences are similar to A's; second, recommend to A the items of B that A has not purchased.

Item-Based CF: Recommend to a user an item B that is similar to an item A the user bought before. The similarity between items is not calculated from their content attributes. Instead, it is calculated from users' historical behavior.
If most people who like item A also like item B, then items A and B are considered similar, and item B is recommended to anyone who likes item A but has not yet chosen item B. The algorithm has two steps: first, calculate the similarity between items; second, generate a recommendation list for each user from the item similarities and the user's historical behavior.
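
The two steps of User-Based CF described above can be sketched in plain Python. This is a minimal illustration, not the Mahout implementation: the toy ratings, the function names `cosine` and `recommend_user_based`, and the choice of cosine similarity with a single nearest neighbor are all assumptions made for the example.

```python
from math import sqrt

# Hypothetical toy ratings (not from Book-Crossing): user -> {item: rating}.
ratings = {
    "A": {"book1": 5, "book2": 3},
    "B": {"book1": 5, "book2": 3, "book3": 4},
    "C": {"book1": 1, "book2": 5, "book4": 5},
}

def cosine(u, v):
    """Cosine similarity between two sparse rating vectors stored as dicts."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

def recommend_user_based(target, ratings):
    """Return (nearest user, that user's items the target has not rated)."""
    # Step 1: find the user whose preferences are most similar to the target's.
    # Assumes the data holds at least one other user.
    others = [u for u in ratings if u != target]
    nearest = max(others, key=lambda u: cosine(ratings[target], ratings[u]))
    # Step 2: recommend the neighbor's items that the target has not rated,
    # highest neighbor rating first.
    unseen = set(ratings[nearest]) - set(ratings[target])
    return nearest, sorted(unseen, key=lambda i: -ratings[nearest][i])

print(recommend_user_based("A", ratings))
```

A practical recommender would usually aggregate over several neighbors rather than one; the single-neighbor form is kept here only because it mirrors the "user B" wording of the description.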
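
The two steps of Item-Based CF can be sketched in the same way. Again this is only an illustration under stated assumptions: the toy `likes` data, the function names, and the co-occurrence form of cosine similarity (shared fans divided by the geometric mean of the two fan counts) are choices made for the example and are not taken from Mahout.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy data: user -> set of items the user liked.
likes = {
    "u1": {"A", "B"},
    "u2": {"A", "B", "C"},
    "u3": {"A", "B"},
    "u4": {"A"},
    "u5": {"C"},
}

def item_similarities(likes):
    """Step 1: item-item similarity computed only from user histories."""
    fans = defaultdict(set)  # item -> set of users who liked it
    for user, items in likes.items():
        for item in items:
            fans[item].add(user)
    sim = defaultdict(dict)
    for i in fans:
        for j in fans:
            if i != j:
                both = len(fans[i] & fans[j])
                sim[i][j] = both / sqrt(len(fans[i]) * len(fans[j]))
    return sim

def recommend_item_based(user, likes, top_n=2):
    """Step 2: score items the user has not chosen by similarity to liked items."""
    sim = item_similarities(likes)
    scores = defaultdict(float)
    for liked in likes[user]:
        for other, s in sim[liked].items():
            if other not in likes[user] and s > 0:
                scores[other] += s
    return sorted(scores, key=scores.get, reverse=True)[:top_n]

print(recommend_item_based("u4", likes))
```

In this toy data most users who like item A also like item B, so B ranks first for user `u4`, who likes A but has not chosen B. No content attributes of the items are used anywhere, which matches the description above.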