{"title":"Predicting customers’ cross-buying decisions: a two-stage machine learning approach","authors":"M. Kilinç, Robert Rohrhirsch","doi":"10.1080/2573234X.2022.2128447","DOIUrl":null,"url":null,"abstract":"ABSTRACT Predicting a customer’s cross-buying behaviour is a challenging problem for many organisations. In this paper, we propose a novel two-stage cross-buying prediction framework by integrating machine learning, feature engineering, and interpretation techniques. Specifically, the first stage aims to train an accurate complex black-box classification model with cross-validation and hyperparameter tuning. Then, the next stage uses the top ten most important predictors of the black-box model to obtain a simple rule-based interpretable model. We use a publicly available dataset published on the Harvard Dataverse to provide a practical case study. The results show that the rule-based model has a predictive performance as high as the complex model.","PeriodicalId":36417,"journal":{"name":"Journal of Business Analytics","volume":null,"pages":null},"PeriodicalIF":1.7000,"publicationDate":"2022-09-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Journal of Business Analytics","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1080/2573234X.2022.2128447","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
引用次数: 0
Abstract
ABSTRACT Predicting a customer’s cross-buying behaviour is a challenging problem for many organisations. In this paper, we propose a novel two-stage cross-buying prediction framework by integrating machine learning, feature engineering, and interpretation techniques. Specifically, the first stage aims to train an accurate complex black-box classification model with cross-validation and hyperparameter tuning. Then, the next stage uses the top ten most important predictors of the black-box model to obtain a simple rule-based interpretable model. We use a publicly available dataset published on the Harvard Dataverse to provide a practical case study. The results show that the rule-based model has a predictive performance as high as the complex model.