Multivariate Arrival Times with Recurrent Neural Networks for Personalized Demand Forecasting

2018 IEEE International Conference on Data Mining Workshops (ICDMW). Pub Date: 2018-11-01. DOI: 10.1109/ICDMW.2018.00121

Tianle Chen, Brian Keng, Javier Moreno


Citations: 6

Abstract

Access to a large variety of data across a massive population has made it possible to predict customer purchase patterns and responses to marketing campaigns. In particular, accurate demand forecasts for popular products with frequent repeat purchases are essential, since these products are among the main drivers of profit. However, buyer purchase patterns are extremely diverse and sparse at the per-product level, owing to population heterogeneity and to dependence in purchase patterns across product categories. Traditional methods in survival analysis have proven effective in dealing with censored data by assuming parametric distributions on inter-arrival times; the distributional parameters are then fitted, typically in a regression framework. Neural-network-based models, on the other hand, take a non-parametric approach and learn relations from a larger functional class. However, their lack of distributional assumptions makes it difficult to model partially observed data. In this paper, we directly model the inter-arrival times as well as the partially observed information at each time step in a survival-based approach, using Recurrent Neural Networks (RNNs) to model purchase times jointly over several products. Instead of predicting a point estimate for inter-arrival times, the RNN outputs parameters that define a distributional estimate. The loss function is the negative log-likelihood of these parameters given partially observed data. This approach allows one to leverage both fully observed data and partial information. By externalizing the censoring problem through a log-likelihood loss function, we show that substantial improvements over state-of-the-art machine learning methods can be achieved. We present experimental results based on two open datasets as well as a study on a real dataset from a large retailer.
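The censored negative log-likelihood described in the abstract can be illustrated with a minimal sketch. The abstract does not name the parametric family, so the Weibull distribution used here is an assumption, as are the function name `weibull_censored_nll` and its arguments. In the paper's setting the scale and shape values would be emitted by the RNN at each time step for each product; here they are passed in as plain lists so that the loss can be read on its own. A fully observed inter-arrival time contributes its log-density, while a censored one (no purchase seen yet after waiting time t) contributes only the log-survival probability.

```python
import math


def weibull_censored_nll(times, observed, scale, shape):
    """Mean negative log-likelihood under a Weibull model with right-censoring.

    times    -- elapsed times (inter-arrival time if observed, waiting time so far if censored)
    observed -- True where the arrival was seen, False where the record is censored
    scale    -- per-record Weibull scale parameters (hypothetical model outputs), > 0
    shape    -- per-record Weibull shape parameters (hypothetical model outputs), > 0
    """
    total = 0.0
    for t, is_observed, lam, k in zip(times, observed, scale, shape):
        cum_hazard = (t / lam) ** k  # Weibull cumulative hazard at t
        if is_observed:
            # log f(t) = log(k / lam) + (k - 1) * log(t / lam) - (t / lam)^k
            total += math.log(k / lam) + (k - 1.0) * math.log(t / lam) - cum_hazard
        else:
            # log S(t) = -(t / lam)^k: the event has not happened by time t
            total += -cum_hazard
    return -total / len(times)


if __name__ == "__main__":
    # One observed and one censored record with identical parameters.
    loss = weibull_censored_nll(
        times=[2.0, 2.0],
        observed=[True, False],
        scale=[2.0, 2.0],
        shape=[1.0, 1.0],
    )
    print(loss)
```

Because censored records still contribute a term to the loss, partially observed sequences influence the fitted parameters rather than being discarded, which is the property the abstract emphasizes.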
