{"title":"Linear and Non-Linear Models for Purchase Prediction","authors":"Wenliang Chen, Zhenghua Li, Min Zhang","doi":"10.1145/2813448.2813518","DOIUrl":"https://doi.org/10.1145/2813448.2813518","url":null,"abstract":"In this paper, we present our approach for the task of product purchase prediction. In the task, there are a collection of sequences of click events: click sessions. For some of the sessions, there are also buying events. The target of this task is to predict whether a user is going to buy something or not in a session, and if the user is buying, which products (items) the user is going to buy. In our approach, we treat the task as a classification problem and use linear and non-linear models to make the predictions, and then build an ensemble system based on the output of the individual systems. The evaluation results show that our final system is effective on the test data.","PeriodicalId":324873,"journal":{"name":"Proceedings of the 2015 International ACM Recommender Systems Challenge","volume":"66 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116472936","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An ensemble approach for multi-label classification of item click sequences","authors":"A. Murat Yağcı, Tevfik Aytekin, F. Gürgen","doi":"10.1145/2813448.2813516","DOIUrl":"https://doi.org/10.1145/2813448.2813516","url":null,"abstract":"In this paper, we describe our approach to RecSys 2015 challenge problem. Given a dataset of item click sessions, the problem is to predict whether a session results in a purchase and which items are purchased if the answer is yes. We define a simpler analogous problem where given an item and its session, we try to predict the probability of purchase for the given item. For each session, the predictions result in a set of purchased items or often an empty set. We apply monthly time windows over the dataset. For each item in a session, we engineer features regarding the session, the item properties, and the time window. Then, a balanced random forest classifier is trained to perform predictions on the test set. The dataset is particularly challenging due to privacy-preserving definition of a session, the class imbalance problem, and the volume of data. We report our findings with respect to feature engineering, the choice of sampling schemes, and classifier ensembles. Experimental results together with benefits and shortcomings of the proposed approach are discussed. The solution is efficient and practical in commodity computers.","PeriodicalId":324873,"journal":{"name":"Proceedings of the 2015 International ACM Recommender Systems Challenge","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127497581","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Predicting User Purchase in E-commerce by Comprehensive Feature Engineering and Decision Boundary Focused Under-Sampling","authors":"Chanyoung Park, Dong Hyun Kim, Jinoh Oh, Hwanjo Yu","doi":"10.1145/2813448.2813517","DOIUrl":"https://doi.org/10.1145/2813448.2813517","url":null,"abstract":"The goal of RecSys Challenge 2015 [2] is: (1) to predict which user will end up with a purchase and if so, (2) to predict items that he/she will buy given click/purchase data provided by YOOCHOOSE. It is hard to achieve the goal of this Challenge because (1) the data does not contain user demographics information and it contains a lot of missing values and (2) the volume of the dataset is massive with about 33 million clicks and 1 million purchase history and the class distribution (the ratio of non-purchased clicks to purchased clicks) is highly imbalanced. In order to efficiently solve these problems, we propose (1) Comprehensive Feature Engineering method (CFE) including imputation of missing values to make up for insufficiency of information and (2) Decision Boundary Focused Under-Sampling method (DBFUS) to cope with class imbalance problem and to reduce learning time and memory usage. Our proposed approach obtained 54403.6 points on the final leaderboard.","PeriodicalId":324873,"journal":{"name":"Proceedings of the 2015 International ACM Recommender Systems Challenge","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2015-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125999806","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}