具有参与约束的高效规划算法

Proceedings of the 23rd ACM Conference on Economics and Computation Pub Date : 2022-05-16 DOI:10.1145/3490486.3538280

Hanrui Zhang, Yu Cheng

{"title":"具有参与约束的高效规划算法","authors":"Hanrui Zhang, Yu Cheng","doi":"10.1145/3490486.3538280","DOIUrl":null,"url":null,"abstract":"We consider the problem of planning with participation constraints introduced in[24]. In this problem, a principal chooses actions in a Markov decision process, resulting in separate utilities for the principal and the agent. However, the agent can and will choose to end the process whenever his expected onward utility becomes negative. The principal seeks to compute and commit to a policy that maximizes her expected utility, under the constraint that the agent should always want to continue participating. We provide the first polynomial-time exact algorithm for this problem for finite-horizon settings, where previously only an additive ε-approximation algorithm was known. Our approach can also be extended to the (discounted) infinite-horizon case, for which we give an algorithm that runs in time polynomial in the size of the input and log(1/ε), and returns a policy that is optimal up to an additive error of ε.","PeriodicalId":209859,"journal":{"name":"Proceedings of the 23rd ACM Conference on Economics and Computation","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"Efficient Algorithms for Planning with Participation Constraints\",\"authors\":\"Hanrui Zhang, Yu Cheng\",\"doi\":\"10.1145/3490486.3538280\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We consider the problem of planning with participation constraints introduced in[24]. In this problem, a principal chooses actions in a Markov decision process, resulting in separate utilities for the principal and the agent. However, the agent can and will choose to end the process whenever his expected onward utility becomes negative. The principal seeks to compute and commit to a policy that maximizes her expected utility, under the constraint that the agent should always want to continue participating. We provide the first polynomial-time exact algorithm for this problem for finite-horizon settings, where previously only an additive ε-approximation algorithm was known. Our approach can also be extended to the (discounted) infinite-horizon case, for which we give an algorithm that runs in time polynomial in the size of the input and log(1/ε), and returns a policy that is optimal up to an additive error of ε.\",\"PeriodicalId\":209859,\"journal\":{\"name\":\"Proceedings of the 23rd ACM Conference on Economics and Computation\",\"volume\":\"7 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-05-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 23rd ACM Conference on Economics and Computation\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3490486.3538280\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 23rd ACM Conference on Economics and Computation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3490486.3538280","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 3

摘要

我们考虑[24]中引入的参与约束的规划问题。在这个问题中，委托人在马尔可夫决策过程中选择行为，导致委托人和代理人的效用不同。然而，当代理的预期效用变为负值时，代理可以并且将选择结束该过程。在代理人总是希望继续参与的约束下，委托人寻求计算并承诺使其期望效用最大化的策略。我们提供了该问题的第一个多项式时间精确算法，用于有限视界设置，其中以前只有一个已知的加性ε-近似算法。我们的方法也可以扩展到(贴现)无限视界情况，对于这种情况，我们给出了一个算法，该算法在输入大小和log(1/ε)的时间多项式中运行，并返回一个最优策略，直到可加误差为ε。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Efficient Algorithms for Planning with Participation Constraints

We consider the problem of planning with participation constraints introduced in[24]. In this problem, a principal chooses actions in a Markov decision process, resulting in separate utilities for the principal and the agent. However, the agent can and will choose to end the process whenever his expected onward utility becomes negative. The principal seeks to compute and commit to a policy that maximizes her expected utility, under the constraint that the agent should always want to continue participating. We provide the first polynomial-time exact algorithm for this problem for finite-horizon settings, where previously only an additive ε-approximation algorithm was known. Our approach can also be extended to the (discounted) infinite-horizon case, for which we give an algorithm that runs in time polynomial in the size of the input and log(1/ε), and returns a policy that is optimal up to an additive error of ε.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 23rd ACM Conference on Economics and Computation

自引率

0.00%

发文量