{"title":"在不安分的多臂强盗中尽量降低成本,而不是尽量增加回报","authors":"R. Teal Witter, Lisa Hellerstein","doi":"arxiv-2409.03071","DOIUrl":null,"url":null,"abstract":"Restless Multi-Armed Bandits (RMABs) offer a powerful framework for solving\nresource constrained maximization problems. However, the formulation can be\ninappropriate for settings where the limiting constraint is a reward threshold\nrather than a budget. We introduce a constrained minimization problem for RMABs\nthat balances the goal of achieving a reward threshold while minimizing total\ncost. We show that even a bi-criteria approximate version of the problem is\nPSPACE-hard. Motivated by the hardness result, we define a decoupled problem,\nindexability and a Whittle index for the minimization problem, mirroring the\ncorresponding concepts for the maximization problem. Further, we show that the\nWhittle index for the minimization problem can easily be computed from the\nWhittle index for the maximization problem. Consequently, Whittle index results\non RMAB instances for the maximization problem give Whittle index results for\nthe minimization problem. Despite the similarities between the minimization and\nmaximization problems, solving the minimization problem is not as simple as\ntaking direct analogs of the heuristics for the maximization problem. We give\nan example of an RMAB for which the greedy Whittle index heuristic achieves the\noptimal solution for the maximization problem, while the analogous heuristic\nyields the worst possible solution for the minimization problem. In light of\nthis, we present and compare several heuristics for solving the minimization\nproblem on real and synthetic data. Our work suggests the importance of\ncontinued investigation into the minimization problem.","PeriodicalId":501525,"journal":{"name":"arXiv - CS - Data Structures and Algorithms","volume":"95 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Minimizing Cost Rather Than Maximizing Reward in Restless Multi-Armed Bandits\",\"authors\":\"R. Teal Witter, Lisa Hellerstein\",\"doi\":\"arxiv-2409.03071\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Restless Multi-Armed Bandits (RMABs) offer a powerful framework for solving\\nresource constrained maximization problems. However, the formulation can be\\ninappropriate for settings where the limiting constraint is a reward threshold\\nrather than a budget. We introduce a constrained minimization problem for RMABs\\nthat balances the goal of achieving a reward threshold while minimizing total\\ncost. We show that even a bi-criteria approximate version of the problem is\\nPSPACE-hard. Motivated by the hardness result, we define a decoupled problem,\\nindexability and a Whittle index for the minimization problem, mirroring the\\ncorresponding concepts for the maximization problem. Further, we show that the\\nWhittle index for the minimization problem can easily be computed from the\\nWhittle index for the maximization problem. Consequently, Whittle index results\\non RMAB instances for the maximization problem give Whittle index results for\\nthe minimization problem. Despite the similarities between the minimization and\\nmaximization problems, solving the minimization problem is not as simple as\\ntaking direct analogs of the heuristics for the maximization problem. 
We give\\nan example of an RMAB for which the greedy Whittle index heuristic achieves the\\noptimal solution for the maximization problem, while the analogous heuristic\\nyields the worst possible solution for the minimization problem. In light of\\nthis, we present and compare several heuristics for solving the minimization\\nproblem on real and synthetic data. Our work suggests the importance of\\ncontinued investigation into the minimization problem.\",\"PeriodicalId\":501525,\"journal\":{\"name\":\"arXiv - CS - Data Structures and Algorithms\",\"volume\":\"95 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-04\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Data Structures and Algorithms\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.03071\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Data Structures and Algorithms","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.03071","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Minimizing Cost Rather Than Maximizing Reward in Restless Multi-Armed Bandits
Restless Multi-Armed Bandits (RMABs) offer a powerful framework for solving resource-constrained maximization problems. However, the formulation can be inappropriate for settings where the limiting constraint is a reward threshold rather than a budget. We introduce a constrained minimization problem for RMABs that balances the goal of achieving a reward threshold against the goal of minimizing total cost. We show that even a bi-criteria approximate version of the problem is PSPACE-hard. Motivated by the hardness result, we define a decoupled problem, indexability, and a Whittle index for the minimization problem, mirroring the corresponding concepts for the maximization problem. Further, we show that the Whittle index for the minimization problem can easily be computed from the Whittle index for the maximization problem. Consequently, Whittle index results on RMAB instances for the maximization problem yield Whittle index results for the minimization problem. Despite the similarities between the minimization and maximization problems, solving the minimization problem is not as simple as taking direct analogs of the heuristics for the maximization problem. We give an example of an RMAB for which the greedy Whittle index heuristic achieves the optimal solution for the maximization problem, while the analogous heuristic yields the worst possible solution for the minimization problem. In light of this, we present and compare several heuristics for solving the minimization problem on real and synthetic data. Our work suggests the importance of continued investigation into the minimization problem.
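
One plausible way to formalize the constrained minimization problem described in the abstract, given purely as an illustration: the finite horizon T, per-arm action costs c_i, state rewards r_i, and reward threshold R below are assumptions, and the paper's exact definition may differ.

    \min_{\pi}\; \mathbb{E}_{\pi}\Big[ \sum_{t=1}^{T} \sum_{i=1}^{N} c_i\, a_i^{(t)} \Big]
    \quad \text{subject to} \quad
    \mathbb{E}_{\pi}\Big[ \sum_{t=1}^{T} \sum_{i=1}^{N} r_i\big(s_i^{(t)}\big) \Big] \ge R,

where a_i^{(t)} \in \{0, 1\} is the action applied to arm i in round t under policy \pi and s_i^{(t)} is that arm's state. The standard budgeted maximization RMAB instead maximizes expected reward subject to a per-round budget \sum_i a_i^{(t)} \le k, which is the sense in which the limiting constraint moves from a budget to a reward threshold.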
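
For readers unfamiliar with the greedy Whittle index heuristic mentioned in the abstract, the Python sketch below shows the standard rule for the budgeted maximization problem: each round, act on the arms whose current states have the highest Whittle indices. The indices, transition kernels, and problem sizes here are illustrative placeholders rather than values from the paper, and the paper's minimization heuristics are not reproduced.

    import numpy as np

    rng = np.random.default_rng(0)

    n_arms, n_states, budget, horizon = 5, 3, 2, 4

    # whittle[i, s]: assumed precomputed Whittle index of arm i in state s
    # (illustrative random values; real indices come from each arm's model).
    whittle = rng.normal(size=(n_arms, n_states))

    # Toy transition kernels: P[i, a, s] is a distribution over next states
    # when arm i in state s receives action a (0 = passive, 1 = active).
    P = rng.dirichlet(np.ones(n_states), size=(n_arms, 2, n_states))

    states = rng.integers(n_states, size=n_arms)

    for t in range(horizon):
        # Greedy Whittle rule for the budgeted maximization problem:
        # act on the `budget` arms whose current states have the largest index.
        current = whittle[np.arange(n_arms), states]
        active = np.argsort(current)[-budget:]
        actions = np.zeros(n_arms, dtype=int)
        actions[active] = 1
        # Each arm transitions according to the action applied to it.
        states = np.array([
            rng.choice(n_states, p=P[i, actions[i], states[i]])
            for i in range(n_arms)
        ])
        print(f"round {t}: acted on arms {sorted(active.tolist())}")

The abstract's point is that a direct analog of this rule can yield the worst possible solution for the minimization problem on some instances, which is why the paper develops and compares dedicated heuristics for the minimization setting.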