Experimenting on Markov Decision Processes with Local Treatments

Shuze Chen, David Simchi-Levi, Chonghuan Wang
arXiv - ECON - Econometrics, published 2024-07-29. DOI: arxiv-2407.19618 (https://doi.org/arxiv-2407.19618)
Citations: 0

Abstract

As service systems grow increasingly complex and dynamic, many interventions become localized: they are available and take effect only in specific states. This paper investigates experiments with local treatments on a widely used class of dynamic models, Markov Decision Processes (MDPs). In particular, we focus on utilizing the local structure to improve the inference efficiency of the average treatment effect. We begin by demonstrating the efficiency of classical inference methods, including model-based estimation and temporal difference learning under a fixed policy, as well as classical A/B testing with general treatments. We then introduce a variance reduction technique that exploits the local treatment structure by sharing information for states unaffected by the treatment policy. Our new estimator overcomes the variance lower bound for general treatments while matching the more stringent lower bound that incorporates the local treatment structure. Furthermore, for a major part of the variance, our estimator optimally achieves a reduction that is linear in the number of test arms. Finally, we explore scenarios with perfect knowledge of the control arm and design estimators that further improve inference efficiency.
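The idea of sharing information across arms for states the treatment does not touch can be illustrated with a toy sketch. This is not the paper's estimator. It assumes a three-state chain, a reward that is known and depends only on the state, and a treatment that changes only the transition row of state 0. The matrices `P0` and `P1`, the reward vector `r`, and all function names are hypothetical choices made for this illustration. The sketch compares a plug-in model-based estimate of the average treatment effect that fits each arm separately with one that pools transition counts for the unaffected states.

```python
import random


def simulate(P, start, steps, rng):
    """Run one trajectory of the chain P and return transition counts."""
    n = len(P)
    counts = [[0] * n for _ in range(n)]
    s = start
    for _ in range(steps):
        s2 = rng.choices(range(n), weights=P[s])[0]
        counts[s][s2] += 1
        s = s2
    return counts


def normalize(counts):
    """Row-normalize counts into a transition matrix (uniform if a row is empty)."""
    n = len(counts)
    P = []
    for row in counts:
        total = sum(row)
        P.append([c / total for c in row] if total else [1.0 / n] * n)
    return P


def average_reward(P, r, iters=2000):
    """Long-run average reward, using power iteration for the stationary law."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[s] * P[s][s2] for s in range(n)) for s2 in range(n)]
    return sum(p * x for p, x in zip(pi, r))


def ate_separate(c0, c1, r):
    """Plug-in ATE where each arm's model uses only that arm's data."""
    return average_reward(normalize(c1), r) - average_reward(normalize(c0), r)


def ate_pooled(c0, c1, r, treated_states):
    """Plug-in ATE that pools both arms' counts for states outside treated_states."""
    n = len(c0)
    pooled = [[a + b for a, b in zip(c0[s], c1[s])] for s in range(n)]
    m0 = [c0[s] if s in treated_states else pooled[s] for s in range(n)]
    m1 = [c1[s] if s in treated_states else pooled[s] for s in range(n)]
    return average_reward(normalize(m1), r) - average_reward(normalize(m0), r)


# Hypothetical control and treatment chains: they differ only in row 0.
P0 = [[0.5, 0.3, 0.2], [0.2, 0.5, 0.3], [0.3, 0.2, 0.5]]
P1 = [[0.2, 0.3, 0.5], [0.2, 0.5, 0.3], [0.3, 0.2, 0.5]]
r = [0.0, 1.0, 2.0]  # assumed known, state-only reward

rng = random.Random(0)
c0 = simulate(P0, 0, 5000, rng)
c1 = simulate(P1, 0, 5000, rng)
print("separate:", ate_separate(c0, c1, r))
print("pooled:  ", ate_pooled(c0, c1, r, {0}))
```

Both estimators target the same quantity. In the pooled version the rows for states 1 and 2 are estimated from roughly twice as many transitions, which is the intuition behind reducing variance by sharing information across arms.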