Improved decision-making through life event prediction: A case study in the financial services industry
Stephanie Beyer Diaz, Kristof Coussement, Arno De Caigny
Decision Support Systems, vol. 187, Article 114342. Published 2024-10-02. DOI: 10.1016/j.dss.2024.114342
Impact factor: 6.7. JCR: Q1 (Computer Science, Artificial Intelligence).
https://www.sciencedirect.com/science/article/pii/S0167923624001751
Citation count: 0
Abstract
Life event prediction is an important tool for customer relationship management (CRM) because life events shift customers’ preferences toward different products and services. Existing life event research mainly uses cross-sectional data, whereas incorporating longitudinal data is increasingly common in the CRM field. Because longitudinal data can capture the dynamics of customer behavior, an opportunity arises to benchmark the predictive power of longitudinal customer data in cross-sectional versus longitudinal setups for life event prediction. Therefore, this study compares statistical and machine learning (SaML) classifiers, such as logistic regression, random forest, and XGBoost, with long short-term memory (LSTM) networks, using data represented in both cross-sectional and longitudinal setups. Using a real-life longitudinal customer data set from a European bank, the authors represent the longitudinal data in a cross-sectional format through featurization in the form of aggregation. The available data cover 42 end-of-month snapshots for 760,438 unique customers. For the marketing decision-making literature, this article (1) introduces three novel life events (i.e., primary, secondary, and rental residence purchases) to life event prediction; (2) offers guidance on how to leverage longitudinal customer data, based on a comparison of featurization approaches and a benchmark of SaML classifiers against LSTM; and (3) clarifies the importance of features and timing for improving marketing decision-making dynamically. The results show that aggregating features over time is the preferable featurization approach for cross-sectional modeling with SaML classifiers. Furthermore, unlike SaML classifiers, LSTM can capture behavioral changes over time, and it performs significantly better than SaML classifiers on the area under the curve (AUC) and F1 metrics. Integrated gradients reveal that feature importance changes over time. The integrated gradients method can help decision-makers plan effective communication with customers in advance, for example by allocating more resources to customers who exhibit a high probability of a particular life event.
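The two data setups contrasted in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the snapshot records, the feature names (`balance`, `n_transactions`), and the choice of summary statistics (mean, min, max, last value) are all assumptions made for illustration. The longitudinal setup keeps one time-ordered sequence per customer, as an LSTM would consume it, while the cross-sectional setup collapses each sequence into one row of aggregated features for a SaML classifier.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical end-of-month snapshots: (customer_id, month_index, balance, n_transactions).
# These values and feature names are illustrative, not taken from the study's data.
snapshots = [
    ("c1", 0, 1000.0, 5),
    ("c1", 1, 1200.0, 7),
    ("c1", 2, 1500.0, 9),
    ("c2", 0, 300.0, 2),
    ("c2", 1, 250.0, 1),
    ("c2", 2, 200.0, 1),
]

FEATURES = ("balance", "n_transactions")


def to_sequences(rows):
    """Longitudinal setup: one time-ordered list of feature vectors per customer."""
    by_customer = defaultdict(list)
    for customer_id, _month, *values in sorted(rows, key=lambda r: (r[0], r[1])):
        by_customer[customer_id].append([float(v) for v in values])
    return dict(by_customer)


def aggregate(sequences):
    """Cross-sectional setup: collapse each sequence into summary statistics."""
    table = {}
    for customer_id, steps in sequences.items():
        row = {}
        for j, name in enumerate(FEATURES):
            series = [step[j] for step in steps]
            row[f"{name}_mean"] = mean(series)
            row[f"{name}_min"] = min(series)
            row[f"{name}_max"] = max(series)
            row[f"{name}_last"] = series[-1]
        table[customer_id] = row
    return table


sequences = to_sequences(snapshots)      # input shape for a sequence model
cross_section = aggregate(sequences)     # one feature row per customer
```

The aggregated row discards the order of the snapshots, which is why a sequence model can pick up behavioral changes over time that the cross-sectional representation only captures indirectly, for instance through a last-value or min/max feature.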
About the journal:
The common thread of articles published in Decision Support Systems is their relevance to theoretical and technical issues in the support of enhanced decision making. The areas addressed may include foundations, functionality, interfaces, implementation, impacts, and evaluation of decision support systems (DSSs).