Disentangling Exploration from Exploitation

Leeat Yariv
Proceedings of the 22nd ACM Conference on Economics and Computation, 2021-07-18
DOI: 10.1145/3465456.3467524 (https://doi.org/10.1145/3465456.3467524)

Abstract

A key tension in the study of experimentation revolves around the exploration of new possibilities and the exploitation of prior discoveries. Starting from Robbins (1952), a large literature in economics and statistics has married the two: agents experiment by selecting potentially risky options and observing their resulting payoffs. This framework has been used in many applications, ranging from pricing decisions to labor market search. Nonetheless, in many applications, agents' exploration and exploitation need not be intertwined. An investor may study stocks she is not invested in, an employee may explore alternative jobs while working, etc. The current paper focuses on the consequences of disentangling exploration from exploitation. This talk will cover some insights generated from joint work with Alessandro Lizzeri (Princeton University) and Eran Shmaya (Stony Brook University). We consider the classical Poisson bandit problem that has served as the canonical model for experimentation. We fully characterize the solution when exploration and exploitation are disentangled, both for the "good news" and "bad news" settings. We illustrate the stark differences the optimal exploration policy exhibits compared to the standard setting. In particular, we show that agents optimally utilize the option to observe projects different from the ones they act on. In the good news case, the optimal policy entails the continued exploration of a single arm, no matter how pessimistic the decision-maker becomes about that arm, until news arrives. In contrast, in the bad news case, exploration can involve the use of more than a single arm, but entails at most one switch. In all settings, the separation of exploration from exploitation guarantees asymptotic efficiency.

BIO: Leeat Yariv is the Uwe E. Reinhardt Professor of Economics at Princeton University. She is also the director of the Princeton Experimental Laboratory for the Social Sciences (PExL), which she opened. She is the lead editor of AEJ: Micro and has served on the editorial boards of multiple journals. She is a member of the American Academy of Arts and Sciences and a fellow of the Econometric Society and the Society for the Advancement of Economic Theory. She is also a research associate of the National Bureau of Economic Research (NBER) and a research fellow of the Centre for Economic Policy Research (CEPR). Yariv's work focuses on market design, social networks, and political economy. She uses theory, lab experiments, and field studies to understand how individuals connect to one another and how they make decisions, on their own and collectively.
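
As a rough illustration of why the decision-maker in the abstract's good news case grows pessimistic while waiting, the sketch below applies Bayes' rule to a textbook good-news Poisson arm. It is not taken from the paper: it assumes that a good arm produces news at a constant Poisson rate and a bad arm never produces news, and the function name `posterior_after_silence` and the parameters `prior`, `rate`, and `elapsed` are hypothetical names chosen for this example.

```python
import math


def posterior_after_silence(prior: float, rate: float, elapsed: float) -> float:
    """Belief that the observed arm is good after `elapsed` time with no news.

    Assumption (textbook good-news setup, not the paper's full model):
    a good arm generates news at Poisson rate `rate`; a bad arm never does.
    """
    # Probability of seeing no arrival for `elapsed` time if the arm is good.
    no_news_if_good = math.exp(-rate * elapsed)
    # A bad arm is silent with probability 1, so Bayes' rule gives:
    return prior * no_news_if_good / (prior * no_news_if_good + (1.0 - prior))


if __name__ == "__main__":
    # Hypothetical numbers: 50% prior, arrival rate 1 per unit of time.
    for t in (0.0, 1.0, 5.0, 10.0):
        print(t, posterior_after_silence(0.5, 1.0, t))
```

Under these assumptions the belief falls steadily toward zero the longer the arm stays silent, which is the sense in which the decision-maker "becomes pessimistic" about the arm. The sketch covers only the belief dynamics and does not compute the optimal exploration policy characterized in the talk.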