Customer Acquisition via Display Advertising Using Multi-Armed Bandit Experiments

DecisionSciRN: Decision-Making in Marketing (Topic), Pub Date: 2016-03-29, DOI: 10.2139/ssrn.2368523

Eric M. Schwartz, Eric T. Bradlow, P. Fader


Citations: 242

Abstract

Firms using online advertising regularly run experiments with multiple versions of their ads since they are uncertain about which ones are most effective. Within a campaign, firms try to adapt to intermediate results of their tests, optimizing what they earn while learning about their ads. But how should they decide what percentage of impressions to allocate to each ad? This paper answers that question, resolving the well-known "learn-and-earn" trade-off using multi-armed bandit (MAB) methods. The online advertiser's MAB problem, however, contains particular challenges, such as a hierarchical structure (ads within a website), attributes of actions (creative elements of an ad), and batched decisions (millions of impressions at a time), that are not fully accommodated by existing MAB methods. Our approach captures how the impact of observable ad attributes on ad effectiveness differs by website in unobserved ways, and our policy generates allocations of impressions that can be used in practice. We implemented this policy in a live field experiment delivering over 700 million ad impressions in an online display campaign with a large retail bank. Over the course of two months, our policy achieved an 8% improvement in the customer acquisition rate, relative to a control policy, without any additional costs to the bank. Beyond the actual experiment, we performed counterfactual simulations to evaluate a range of alternative model specifications and allocation rules in MAB policies. Finally, we show that customer acquisition would decrease about 10% if the firm were to optimize click-through rates instead of optimizing conversion directly, a finding that has implications for understanding the marketing funnel.
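The allocation question posed in the abstract (what percentage of impressions each ad should receive in the next batch) can be illustrated with a minimal sketch. This is not the authors' policy: it ignores the hierarchical website structure and the ad attributes, and instead assumes independent Beta-Bernoulli posteriors with a uniform Beta(1, 1) prior for each ad. The function name `allocation_shares` and all numbers are hypothetical. The share given to each ad is a Monte Carlo estimate of the posterior probability that the ad has the highest conversion rate, which is one common way to turn Thompson sampling into a batched allocation.

```python
import random


def allocation_shares(conversions, impressions, n_draws=5000, seed=0):
    """Return one impression share per ad for the next batch.

    Each share estimates the posterior probability that the ad has the
    highest conversion rate, assuming independent Beta(1, 1) priors
    (a simplifying assumption, not the hierarchical model in the paper).
    """
    rng = random.Random(seed)
    wins = [0] * len(conversions)
    for _ in range(n_draws):
        # One joint draw of conversion rates from the Beta posteriors.
        draws = [
            rng.betavariate(1 + c, 1 + (n - c))
            for c, n in zip(conversions, impressions)
        ]
        # Credit the ad that looks best under this draw.
        wins[draws.index(max(draws))] += 1
    return [w / n_draws for w in wins]


# Hypothetical results from a first batch of three ads.
conversions = [12, 30, 18]
impressions = [10000, 10000, 10000]

shares = allocation_shares(conversions, impressions)
for ad, share in enumerate(shares):
    print(f"ad {ad}: {share:.1%} of the next batch")
```

Because the shares come from posterior uncertainty rather than from point estimates, a weaker-looking ad still receives some impressions while the evidence is thin, which is the "learn" side of the trade-off; as data accumulate, the shares concentrate on the best ad, which is the "earn" side.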

