{"title":"An Exploratory Analysis of the Multi-Armed Bandit Problem","authors":"Stanton Hudja, Daniel Woods","doi":"10.2139/ssrn.3942930","DOIUrl":null,"url":null,"abstract":"This paper conducts a laboratory experiment to analyze individual behavior in multi-armed bandit problems. Our experiment consists of four types of multi-armed bandit problems: (i) a two-armed indefinite horizon problem, (ii) a two-armed finite horizon problem, (iii) a three-armed indefinite horizon problem, and (iv) a three-armed finite horizon problem. We find that differences in behavior (switching, experimentation, best arm percentage) between these types of multi-armed bandit problems are consistent with predictions. However, we find that subjects use strategies that are different than predicted. We find that commonly suggested deterministic strategies are poor descriptors of subject behavior and that probabilistic strategies better fit the data. In particular, we find that a simple probabilistic ‘win-stay lose-shift’ strategy best fits most subjects.","PeriodicalId":263662,"journal":{"name":"ERN: Behavioral Economics (Topic)","volume":"128 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ERN: Behavioral Economics (Topic)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2139/ssrn.3942930","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
This paper conducts a laboratory experiment to analyze individual behavior in multi-armed bandit problems. Our experiment consists of four types of multi-armed bandit problems: (i) a two-armed indefinite horizon problem, (ii) a two-armed finite horizon problem, (iii) a three-armed indefinite horizon problem, and (iv) a three-armed finite horizon problem. We find that differences in behavior (switching, experimentation, best-arm percentage) across these types of multi-armed bandit problems are consistent with predictions. However, we find that subjects use strategies that are different from those predicted. We find that commonly suggested deterministic strategies are poor descriptors of subject behavior and that probabilistic strategies fit the data better. In particular, we find that a simple probabilistic 'win-stay, lose-shift' strategy best fits most subjects.
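For readers unfamiliar with the heuristic named in the abstract, the sketch below simulates a probabilistic 'win-stay, lose-shift' rule on a Bernoulli bandit. It is a minimal illustration, not the paper's estimated model: the reward structure, the parameter names p_stay_win and p_shift_lose, and their values are assumptions chosen for exposition.

```python
import random

def simulate_wsls(arm_probs, horizon, p_stay_win=0.9, p_shift_lose=0.7, seed=0):
    """Simulate a probabilistic win-stay, lose-shift heuristic on a Bernoulli bandit.

    arm_probs    : success probability of each arm (unknown to the agent)
    horizon      : number of rounds (a finite-horizon problem)
    p_stay_win   : probability of repeating the same arm after a success (illustrative)
    p_shift_lose : probability of switching arms after a failure (illustrative)
    """
    rng = random.Random(seed)
    arm = rng.randrange(len(arm_probs))  # start on a randomly chosen arm
    total_reward = 0
    for _ in range(horizon):
        win = rng.random() < arm_probs[arm]
        total_reward += int(win)
        other_arms = [a for a in range(len(arm_probs)) if a != arm]
        if win:
            # "win-stay": keep the arm with probability p_stay_win, otherwise switch
            if rng.random() > p_stay_win:
                arm = rng.choice(other_arms)
        else:
            # "lose-shift": switch arms with probability p_shift_lose, otherwise stay
            if rng.random() < p_shift_lose:
                arm = rng.choice(other_arms)
    return total_reward

# Example: a hypothetical two-armed, 30-round finite-horizon problem
print(simulate_wsls([0.3, 0.6], horizon=30))
```

Because staying and shifting are probabilistic rather than deterministic, the rule can occasionally re-sample an arm after a loss or abandon an arm after a win, which is the kind of noisy, state-dependent switching the abstract contrasts with deterministic index-style strategies.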