Minimizing Cost Rather Than Maximizing Reward in Restless Multi-Armed Bandits

R. Teal Witter, Lisa Hellerstein
arXiv - CS - Data Structures and Algorithms, published 2024-09-04. DOI: arxiv-2409.03071 (https://doi.org/arxiv-2409.03071)
Citations: 0

Abstract

Restless Multi-Armed Bandits (RMABs) offer a powerful framework for solving resource-constrained maximization problems. However, the formulation can be inappropriate for settings where the limiting constraint is a reward threshold rather than a budget. We introduce a constrained minimization problem for RMABs that balances achieving a reward threshold against minimizing total cost. We show that even a bi-criteria approximate version of the problem is PSPACE-hard. Motivated by the hardness result, we define a decoupled problem, indexability, and a Whittle index for the minimization problem, mirroring the corresponding concepts for the maximization problem. Further, we show that the Whittle index for the minimization problem can easily be computed from the Whittle index for the maximization problem. Consequently, Whittle index results on RMAB instances for the maximization problem carry over to the minimization problem. Despite the similarities between the two problems, solving the minimization problem is not as simple as taking direct analogs of the heuristics for the maximization problem. We give an example of an RMAB for which the greedy Whittle index heuristic achieves the optimal solution for the maximization problem, while the analogous heuristic yields the worst possible solution for the minimization problem. In light of this, we present and compare several heuristics for solving the minimization problem on real and synthetic data. Our work suggests the importance of continued investigation into the minimization problem.
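The contrast between the two constraint types in the abstract can be illustrated with a deliberately simplified toy. The sketch below is not the paper's model or any of its heuristics: it drops state transitions entirely and treats each arm as a static (reward, cost) pair for a single round, then solves both formulations by brute force. The function names, the arm values, and the budget and threshold are all hypothetical choices made for illustration.

```python
from itertools import combinations


def subsets(n):
    """Yield every subset of arm indices 0..n-1 as a tuple, smallest first."""
    for k in range(n + 1):
        yield from combinations(range(n), k)


def max_reward_under_budget(rewards, costs, budget):
    """Maximization form: highest total reward with total cost <= budget."""
    best = ((), 0)  # pulling no arms is always feasible
    for s in subsets(len(rewards)):
        cost = sum(costs[i] for i in s)
        reward = sum(rewards[i] for i in s)
        if cost <= budget and reward > best[1]:
            best = (s, reward)
    return best


def min_cost_over_threshold(rewards, costs, threshold):
    """Minimization form: lowest total cost with total reward >= threshold.

    Returns None when no subset of arms reaches the threshold.
    """
    best = None
    for s in subsets(len(rewards)):
        cost = sum(costs[i] for i in s)
        reward = sum(rewards[i] for i in s)
        if reward >= threshold and (best is None or cost < best[1]):
            best = (s, cost)
    return best


# Hypothetical three-arm instance (values chosen only for illustration).
rewards = [5, 4, 3]
costs = [4, 2, 3]

print(max_reward_under_budget(rewards, costs, budget=5))
print(min_cost_over_threshold(rewards, costs, threshold=8))
```

On this instance the budgeted problem selects arms 1 and 2, while the threshold problem selects arms 0 and 1, so the two formulations can prefer different arms even on the same data. Brute force is used here only because the instance is tiny; the abstract's hardness result concerns the full restless setting, which this toy does not capture.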