Towards Accurate Predictions of Customer Purchasing Patterns
Rafael Valero-Fernandez, David J. Collins, Colin Rigby, James Bailey
2017 IEEE International Conference on Computer and Information Technology (CIT), published 2017-08-01
DOI: 10.1109/CIT.2017.58 (https://doi.org/10.1109/CIT.2017.58)
Citations: 9
Abstract
A range of algorithms was used to classify online retail customers of a UK company using historical transaction data. The predictive capabilities of the classifiers were assessed using linear regression, Lasso and regression trees. Unlike most related studies, classifications were based upon specific, marketing-focused customer behaviours. Prediction accuracy on customers not used in training was generally better than 80%. The models implemented (and compared) for classification were: Logistic Regression, Quadratic Discriminant Analysis, Linear SVM, RBF SVM, Gaussian Process, Decision Tree, Random Forest and Multi-layer Perceptron (Neural Network). Postcode data was then used to classify solely on demographics derived from the UK Land Registry and similar public data sources. Prediction accuracy remained better than 60%.
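The comparison of classifiers described in the abstract could be set up roughly as follows. This is a minimal sketch, not the authors' implementation: it assumes scikit-learn, substitutes synthetic data from `make_classification` for the retailer's transaction features, and uses illustrative hyperparameters. The helper name `compare_classifiers` is hypothetical. It trains each of the eight listed model families on one subset of customers and reports accuracy on a held-out subset.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_classifiers(X, y, seed=0):
    """Fit each model on a training split and return held-out accuracy per model."""
    # Hold out 30% of the customers so accuracy is measured on unseen rows.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed, stratify=y
    )

    # The eight model families named in the abstract.
    # All hyperparameters here are assumptions chosen for illustration.
    models = {
        "Logistic Regression": LogisticRegression(max_iter=1000),
        "Quadratic Discriminant Analysis": QuadraticDiscriminantAnalysis(),
        "Linear SVM": SVC(kernel="linear"),
        "RBF SVM": SVC(kernel="rbf"),
        "Gaussian Process": GaussianProcessClassifier(random_state=seed),
        "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=seed),
        "Random Forest": RandomForestClassifier(n_estimators=100, random_state=seed),
        "Multi-layer Perceptron": MLPClassifier(max_iter=1000, random_state=seed),
    }

    scores = {}
    for name, model in models.items():
        # Standardise features first; the scaler is fitted on training rows only.
        pipeline = make_pipeline(StandardScaler(), model)
        pipeline.fit(X_train, y_train)
        scores[name] = pipeline.score(X_test, y_test)
    return scores


# Synthetic stand-in for per-customer features and a binary behaviour label.
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=5, random_state=0
)
scores = compare_classifiers(X, y)

for name, accuracy in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name:35s} {accuracy:.3f}")
```

The accuracies printed by this sketch reflect the synthetic data only and say nothing about the figures reported in the paper. Wrapping each model in a pipeline with the scaler keeps the held-out customers out of the preprocessing step as well as the model fit.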