Research on statistics-based model for E-commerce user purchase prediction
Huailin Dong, Lingwei Xie, Zhongnan Zhang
2015 10th International Conference on Computer Science & Education (ICCSE), published 2015-07-22
DOI: 10.1109/ICCSE.2015.7250308 (https://doi.org/10.1109/ICCSE.2015.7250308)
Citations: 5
Abstract
This paper describes our work for the ALIDATA DISCOVERY competition. By analyzing massive real-world user action data provided by Tmall, one of the largest B2C online retail platforms in China, we try to predict future user purchases. Predictions are judged by the F1 score, which combines two components: precision and recall. The provided data set contains more than 500 million action records from over 12 million distinct users. A data set of this size led us to carry out the task in MapReduce fashion on the Open Data Processing Service (ODPS) platform. We first classify all users into groups according to statistical results. We then apply a rule model, a timing model, and a statistics model to predict future user purchases. Of the three, the statistics model obtains the best F1 score.
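To make the evaluation metric concrete, the following is a minimal sketch of how an F1 score can be computed from precision and recall. The abstract does not give the competition's exact scoring rules, so this assumes the standard harmonic-mean definition and assumes that a prediction is a (user, item) pair; the function name `f1_score` and the sample identifiers are hypothetical.

```python
def f1_score(predicted, actual):
    """Return (precision, recall, f1) for two collections of (user_id, item_id) pairs.

    Assumption: F1 is the standard harmonic mean of precision and recall;
    the competition's exact scoring rules are not stated in the abstract.
    """
    predicted, actual = set(predicted), set(actual)
    hits = len(predicted & actual)  # predicted purchases that really happened

    # Guard against empty sets to avoid division by zero.
    precision = hits / len(predicted) if predicted else 0.0
    recall = hits / len(actual) if actual else 0.0

    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Hypothetical toy data: four predicted purchases, three real ones, two in common.
predicted = {("u1", "b1"), ("u1", "b2"), ("u2", "b3"), ("u3", "b4")}
actual = {("u1", "b1"), ("u2", "b3"), ("u2", "b5")}

precision, recall, f1 = f1_score(predicted, actual)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

With this toy data, precision is 2/4 and recall is 2/3, so the F1 score is 4/7. Because the harmonic mean is pulled toward the smaller of the two values, a model cannot score well simply by predicting very many purchases (high recall, low precision) or very few (high precision, low recall).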