An Exploratory Analysis of the Multi-Armed Bandit Problem

Stanton Hudja, Daniel Woods
ERN: Behavioral Economics (Topic), published 2021-10-14. DOI: 10.2139/ssrn.3942930 (https://doi.org/10.2139/ssrn.3942930)

Abstract

This paper conducts a laboratory experiment to analyze individual behavior in multi-armed bandit problems. The experiment consists of four types of multi-armed bandit problems: (i) a two-armed indefinite horizon problem, (ii) a two-armed finite horizon problem, (iii) a three-armed indefinite horizon problem, and (iv) a three-armed finite horizon problem. We find that differences in behavior (switching, experimentation, best arm percentage) between these types of problems are consistent with predictions. However, the strategies that subjects use differ from those predicted. Commonly suggested deterministic strategies are poor descriptors of subject behavior, and probabilistic strategies fit the data better. In particular, a simple probabilistic ‘win-stay lose-shift’ strategy best fits most subjects.
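
To make the idea of a probabilistic ‘win-stay lose-shift’ strategy concrete, the following is a minimal sketch and not the specification estimated in the paper, which the abstract does not give. It assumes Bernoulli (0/1) rewards, and the names `probabilistic_wsls`, `switch_rate`, `p_stay_win`, and `p_shift_loss` as well as the default parameter values are hypothetical, chosen only for illustration.

```python
import random


def probabilistic_wsls(arm_probs, n_rounds, p_stay_win=0.9, p_shift_loss=0.7, seed=0):
    """Simulate a probabilistic win-stay lose-shift player.

    Assumptions (not taken from the paper): rewards are Bernoulli with
    success probabilities `arm_probs`; after a win the player stays with
    probability `p_stay_win`; after a loss the player shifts with
    probability `p_shift_loss` to a uniformly chosen other arm.
    """
    rng = random.Random(seed)
    n_arms = len(arm_probs)
    arm = rng.randrange(n_arms)  # first arm is picked uniformly at random
    choices, rewards = [], []
    for _ in range(n_rounds):
        reward = 1 if rng.random() < arm_probs[arm] else 0
        choices.append(arm)
        rewards.append(reward)
        # Probability of leaving the current arm depends on the last outcome.
        shift_prob = (1.0 - p_stay_win) if reward else p_shift_loss
        if n_arms > 1 and rng.random() < shift_prob:
            arm = rng.choice([a for a in range(n_arms) if a != arm])
    return choices, rewards


def switch_rate(choices):
    """Fraction of consecutive rounds in which the chosen arm changed."""
    if len(choices) < 2:
        return 0.0
    changes = sum(1 for a, b in zip(choices, choices[1:]) if a != b)
    return changes / (len(choices) - 1)


if __name__ == "__main__":
    # Hypothetical three-armed problem with 50 rounds.
    choices, rewards = probabilistic_wsls([0.3, 0.5, 0.7], n_rounds=50)
    print("switch rate:", switch_rate(choices))
    print("best-arm share:", choices.count(2) / len(choices))
```

Setting `p_stay_win=1.0` and `p_shift_loss=1.0` recovers the deterministic win-stay lose-shift rule, so the two parameters give one simple way to compare deterministic and probabilistic variants on the behavioral measures named in the abstract (switching and best arm percentage).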