Commitment Without Regrets: Online Learning in Stackelberg Security Games

Proceedings of the Sixteenth ACM Conference on Economics and Computation. Pub Date: 2015-06-15. DOI: 10.1145/2764468.2764478

Maria-Florina Balcan, Avrim Blum, Nika Haghtalab, A. Procaccia

Citations: 104

Abstract

In a Stackelberg Security Game, a defender commits to a randomized deployment of security resources, and an attacker best-responds by attacking a target that maximizes his utility. While algorithms for computing an optimal strategy for the defender to commit to have had a striking real-world impact, deployed applications require significant information about potential attackers, leading to inefficiencies. We address this problem via an online learning approach. We are interested in algorithms that prescribe a randomized strategy for the defender at each step against an adversarially chosen sequence of attackers, and obtain feedback on their choices (observing either the current attacker type or merely which target was attacked). We design no-regret algorithms whose regret (when compared to the best fixed strategy in hindsight) is polynomial in the parameters of the game, and sublinear in the number of time steps.
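To make the setting concrete, the following is a minimal sketch of the attacker's best response and of regret against the best fixed strategy in hindsight. It is not the algorithm from the paper: it runs a generic Hedge (multiplicative weights) update over a small, hypothetical finite set of candidate coverage vectors, and it assumes the attacker type is observed after each round. All names (`best_response`, `defender_payoff`, `hedge`, `best_fixed_total`), the payoff numbers, the learning rate, and the first-index tie-breaking rule are assumptions made for illustration.

```python
import math
import random


def best_response(coverage, att_cov, att_unc):
    """Index of the target maximizing the attacker's expected utility.

    coverage[i] is the probability that target i is protected.
    Ties go to the lowest index (a simplification chosen for this sketch).
    """
    utils = [c * ac + (1 - c) * au
             for c, ac, au in zip(coverage, att_cov, att_unc)]
    return max(range(len(utils)), key=utils.__getitem__)


def defender_payoff(coverage, attacker, def_cov, def_unc):
    """Defender's expected utility when the given attacker type best-responds."""
    t = best_response(coverage, attacker["cov"], attacker["unc"])
    return coverage[t] * def_cov[t] + (1 - coverage[t]) * def_unc[t]


def hedge(candidates, attackers, def_cov, def_unc, eta, rng):
    """Generic Hedge over a finite candidate set with full-information feedback."""
    weights = [1.0] * len(candidates)
    total = 0.0
    for attacker in attackers:
        # Sample a candidate strategy in proportion to its current weight.
        k = rng.choices(range(len(candidates)), weights=weights)[0]
        total += defender_payoff(candidates[k], attacker, def_cov, def_unc)
        # The attacker type is observed, so every candidate can be scored.
        for j, cand in enumerate(candidates):
            gain = defender_payoff(cand, attacker, def_cov, def_unc)
            weights[j] *= math.exp(eta * gain)
        norm = sum(weights)
        weights = [w / norm for w in weights]  # keep weights numerically stable
    return total


def best_fixed_total(candidates, attackers, def_cov, def_unc):
    """Total payoff of the best single candidate in hindsight."""
    return max(sum(defender_payoff(c, a, def_cov, def_unc) for a in attackers)
               for c in candidates)


# Toy instance: two targets, one resource, two hypothetical attacker types.
DEF_COV, DEF_UNC = [0.0, 0.0], [-1.0, -1.0]
TYPES = [{"cov": [0.0, 0.0], "unc": [1.0, 2.0]},
         {"cov": [0.0, 0.0], "unc": [3.0, 1.0]}]
CANDIDATES = [[1.0, 0.0], [0.75, 0.25], [0.5, 0.5], [0.25, 0.75], [0.0, 1.0]]

if __name__ == "__main__":
    rng = random.Random(0)
    sequence = [rng.choice(TYPES) for _ in range(500)]
    earned = hedge(CANDIDATES, sequence, DEF_COV, DEF_UNC, eta=0.1, rng=rng)
    regret = best_fixed_total(CANDIDATES, sequence, DEF_COV, DEF_UNC) - earned
    print("regret over", len(sequence), "rounds:", regret)
```

The sketch restricts the comparison class to the listed candidates, so the regret it reports is relative to that finite set only, and the attacker sequence here is random rather than adversarial.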

