{"title":"在不安分的多臂强盗中尽量降低成本,而不是尽量增加回报","authors":"R. Teal Witter, Lisa Hellerstein","doi":"arxiv-2409.03071","DOIUrl":null,"url":null,"abstract":"Restless Multi-Armed Bandits (RMABs) offer a powerful framework for solving\nresource constrained maximization problems. However, the formulation can be\ninappropriate for settings where the limiting constraint is a reward threshold\nrather than a budget. We introduce a constrained minimization problem for RMABs\nthat balances the goal of achieving a reward threshold while minimizing total\ncost. We show that even a bi-criteria approximate version of the problem is\nPSPACE-hard. Motivated by the hardness result, we define a decoupled problem,\nindexability and a Whittle index for the minimization problem, mirroring the\ncorresponding concepts for the maximization problem. Further, we show that the\nWhittle index for the minimization problem can easily be computed from the\nWhittle index for the maximization problem. Consequently, Whittle index results\non RMAB instances for the maximization problem give Whittle index results for\nthe minimization problem. Despite the similarities between the minimization and\nmaximization problems, solving the minimization problem is not as simple as\ntaking direct analogs of the heuristics for the maximization problem. We give\nan example of an RMAB for which the greedy Whittle index heuristic achieves the\noptimal solution for the maximization problem, while the analogous heuristic\nyields the worst possible solution for the minimization problem. In light of\nthis, we present and compare several heuristics for solving the minimization\nproblem on real and synthetic data. Our work suggests the importance of\ncontinued investigation into the minimization problem.","PeriodicalId":501525,"journal":{"name":"arXiv - CS - Data Structures and Algorithms","volume":"95 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Minimizing Cost Rather Than Maximizing Reward in Restless Multi-Armed Bandits\",\"authors\":\"R. Teal Witter, Lisa Hellerstein\",\"doi\":\"arxiv-2409.03071\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Restless Multi-Armed Bandits (RMABs) offer a powerful framework for solving\\nresource constrained maximization problems. However, the formulation can be\\ninappropriate for settings where the limiting constraint is a reward threshold\\nrather than a budget. We introduce a constrained minimization problem for RMABs\\nthat balances the goal of achieving a reward threshold while minimizing total\\ncost. We show that even a bi-criteria approximate version of the problem is\\nPSPACE-hard. Motivated by the hardness result, we define a decoupled problem,\\nindexability and a Whittle index for the minimization problem, mirroring the\\ncorresponding concepts for the maximization problem. Further, we show that the\\nWhittle index for the minimization problem can easily be computed from the\\nWhittle index for the maximization problem. Consequently, Whittle index results\\non RMAB instances for the maximization problem give Whittle index results for\\nthe minimization problem. Despite the similarities between the minimization and\\nmaximization problems, solving the minimization problem is not as simple as\\ntaking direct analogs of the heuristics for the maximization problem. 
We give\\nan example of an RMAB for which the greedy Whittle index heuristic achieves the\\noptimal solution for the maximization problem, while the analogous heuristic\\nyields the worst possible solution for the minimization problem. In light of\\nthis, we present and compare several heuristics for solving the minimization\\nproblem on real and synthetic data. Our work suggests the importance of\\ncontinued investigation into the minimization problem.\",\"PeriodicalId\":501525,\"journal\":{\"name\":\"arXiv - CS - Data Structures and Algorithms\",\"volume\":\"95 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Data Structures and Algorithms\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.03071\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Data Structures and Algorithms","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.03071","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Minimizing Cost Rather Than Maximizing Reward in Restless Multi-Armed Bandits
Restless Multi-Armed Bandits (RMABs) offer a powerful framework for solving resource-constrained maximization problems. However, the formulation can be inappropriate for settings where the limiting constraint is a reward threshold rather than a budget. We introduce a constrained minimization problem for RMABs that balances the goal of achieving a reward threshold against the goal of minimizing total cost. We show that even a bi-criteria approximate version of the problem is PSPACE-hard. Motivated by the hardness result, we define a decoupled problem, indexability, and a Whittle index for the minimization problem, mirroring the corresponding concepts for the maximization problem. Further, we show that the Whittle index for the minimization problem can easily be computed from the Whittle index for the maximization problem. Consequently, Whittle index results on RMAB instances for the maximization problem yield Whittle index results for the minimization problem. Despite the similarities between the minimization and maximization problems, solving the minimization problem is not as simple as taking direct analogs of the heuristics for the maximization problem. We give an example of an RMAB for which the greedy Whittle index heuristic achieves the optimal solution for the maximization problem, while the analogous heuristic yields the worst possible solution for the minimization problem. In light of this, we present and compare several heuristics for solving the minimization problem on real and synthetic data. Our work suggests the importance of continued investigation into the minimization problem.
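
One plausible way to formalize the constrained minimization problem described in the abstract, given purely as an illustration: the finite horizon T, per-arm action costs c_i, state rewards r_i, and reward threshold R below are assumptions, and the paper's exact definition may differ.

    \min_{\pi}\; \mathbb{E}_{\pi}\Big[ \sum_{t=1}^{T} \sum_{i=1}^{N} c_i\, a_i^{(t)} \Big]
    \quad \text{subject to} \quad
    \mathbb{E}_{\pi}\Big[ \sum_{t=1}^{T} \sum_{i=1}^{N} r_i\big(s_i^{(t)}\big) \Big] \ge R,

where a_i^{(t)} \in \{0, 1\} is the action applied to arm i in round t under policy \pi and s_i^{(t)} is that arm's state. The standard budgeted maximization RMAB instead maximizes expected reward subject to a per-round budget \sum_i a_i^{(t)} \le k, which is the sense in which the limiting constraint moves from a budget to a reward threshold.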
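
For readers unfamiliar with the greedy Whittle index heuristic mentioned in the abstract, the Python sketch below shows the standard rule for the budgeted maximization problem: each round, act on the arms whose current states have the highest Whittle indices. The indices, transition kernels, and problem sizes here are illustrative placeholders rather than values from the paper, and the paper's minimization heuristics are not reproduced.

    import numpy as np

    rng = np.random.default_rng(0)

    n_arms, n_states, budget, horizon = 5, 3, 2, 4

    # whittle[i, s]: assumed precomputed Whittle index of arm i in state s
    # (illustrative random values; real indices come from each arm's model).
    whittle = rng.normal(size=(n_arms, n_states))

    # Toy transition kernels: P[i, a, s] is a distribution over next states
    # when arm i in state s receives action a (0 = passive, 1 = active).
    P = rng.dirichlet(np.ones(n_states), size=(n_arms, 2, n_states))

    states = rng.integers(n_states, size=n_arms)

    for t in range(horizon):
        # Greedy Whittle rule for the budgeted maximization problem:
        # act on the `budget` arms whose current states have the largest index.
        current = whittle[np.arange(n_arms), states]
        active = np.argsort(current)[-budget:]
        actions = np.zeros(n_arms, dtype=int)
        actions[active] = 1
        # Each arm transitions according to the action applied to it.
        states = np.array([
            rng.choice(n_states, p=P[i, actions[i], states[i]])
            for i in range(n_arms)
        ])
        print(f"round {t}: acted on arms {sorted(active.tolist())}")

The abstract's point is that a direct analog of this rule can yield the worst possible solution for the minimization problem on some instances, which is why the paper develops and compares dedicated heuristics for the minimization setting.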