Virtues of Patience in Strategic Queuing Systems

Proceedings of the 22nd ACM Conference on Economics and Computation. Pub Date: 2020-11-20. DOI: 10.1145/3465456.3467640

J. Gaitonde, É. Tardos


Citations: 12

Abstract

We consider the problem of selfish agents in discrete-time queuing systems, where competing queues try to get their packets served. In this model, each queue gets to send one packet per step to one of the servers, each server attempts to serve the oldest packet it receives, and unprocessed packets are returned to their queues. We model this as a repeated game where queues compete for the capacity of the servers, but where the state of the game evolves as the length of each queue varies, resulting in a highly dependent random process. In classical work on learning in repeated games, the learners evaluate the outcome of their strategy in each step; in our context, this means that queues estimate their success probability at each server. Earlier work by the authors [in EC'20] shows that with no-regret learners, the system needs twice the capacity that would be required in the coordinated setting to ensure queue lengths remain stable despite the selfish behavior of the queues. In this paper, we demonstrate that this myopic way of evaluating outcomes is suboptimal: if more patient queues choose strategies that selfishly maximize their long-run success rate, stability can be ensured with just e/(e-1) ≈ 1.58 times extra capacity, strictly better than what is possible assuming the no-regret property. As these systems induce highly dependent random processes, our analysis draws heavily on techniques from the theory of stochastic processes to establish various game-theoretic properties of these systems. Though these systems are random even under fixed stationary policies by the queues, we show using careful probabilistic arguments that, surprisingly, under such fixed policies these systems have essentially deterministic and explicit asymptotic behavior.
We show that the growth rate of a set can be written as the ratio of a submodular function to a modular function, and use the resulting explicit description to show that the subsets of queues with largest growth rate are closed under union and non-disjoint intersections, which we use in turn to prove the claimed sharp bicriteria result for the equilibria of the resulting system. Our equilibrium analysis relies on a novel deformation argument towards a more analyzable solution that is quite different from classical price-of-anarchy bounds. While the intermediate points in this deformation will not be Nash equilibria, the structure will ensure the relevant constraints and incentives similarly hold to establish monotonicity along this continuous path.
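
The queuing dynamics described in the abstract can be illustrated with a small simulation. This is only a sketch under assumptions that the abstract does not state: packets arrive at queue i as independent Bernoulli trials with rate `arrival_rates[i]`, server j succeeds with probability `service_rates[j]`, each queue follows a fixed stationary policy given as a probability vector over servers, and ties between equally old packets are broken by queue index. The names `simulate` and `patient_capacity_factor` are hypothetical and are not taken from the paper.

```python
import math
import random
from collections import deque


def patient_capacity_factor():
    """The constant e/(e-1) quoted in the abstract for patient queues."""
    return math.e / (math.e - 1)


def simulate(arrival_rates, service_rates, policies, steps, seed=0):
    """Run the discrete-time system and return the final queue lengths.

    arrival_rates[i]: assumed Bernoulli arrival probability of queue i.
    service_rates[j]: assumed success probability of server j.
    policies[i]: fixed probability vector over servers used by queue i.
    """
    rng = random.Random(seed)
    servers = range(len(service_rates))
    # Each queue stores the arrival times of its waiting packets, oldest first.
    queues = [deque() for _ in arrival_rates]
    for t in range(steps):
        # New packets arrive.
        for i, lam in enumerate(arrival_rates):
            if rng.random() < lam:
                queues[i].append(t)
        # Every nonempty queue sends its oldest packet to one server.
        # Each server keeps only the oldest packet it was offered.
        oldest_offer = {}
        for i, q in enumerate(queues):
            if q:
                j = rng.choices(servers, weights=policies[i])[0]
                offer = (q[0], i)  # (arrival time, queue index)
                if j not in oldest_offer or offer < oldest_offer[j]:
                    oldest_offer[j] = offer
        # Each server tries to serve the packet it kept; packets that are
        # not served simply stay at the head of their queue.
        for j, (_, i) in oldest_offer.items():
            if rng.random() < service_rates[j]:
                queues[i].popleft()
    return [len(q) for q in queues]


if __name__ == "__main__":
    print(round(patient_capacity_factor(), 3))
    # Two queues sharing two servers under uniform fixed policies.
    print(simulate([0.3, 0.3], [0.6, 0.6], [[0.5, 0.5], [0.5, 0.5]], 10_000))
```

Running the simulation for growing values of `steps` and dividing the final lengths by `steps` gives an empirical view of the long-run growth rates that the abstract says are essentially deterministic under fixed policies.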

