Adversarial Group Linear Bandits and Its Application to Collaborative Edge Inference

Yin-Hae Huang, Letian Zhang, J. Xu
IEEE INFOCOM 2023 - IEEE Conference on Computer Communications. Published 2023-05-17. DOI: 10.1109/INFOCOM53939.2023.10228900 (https://doi.org/10.1109/INFOCOM53939.2023.10228900)
Citations: 1

Abstract

The multi-armed bandit problem is a classical problem of sequential decision-making under uncertainty. Most existing work studies bandit problems in either the stochastic reward regime or the adversarial reward regime, and the intersection of the two regimes has received much less attention. In this paper, we study a new bandit problem, called adversarial group linear bandits (AGLB), in which the reward is generated jointly by a stochastic process and by adversarial behavior. In particular, the reward that the learner receives is a noisy linear function of the arm that the learner selects within a group, and it also depends on the adversary's group-level attack decision. Such problems arise in many real-world applications, e.g., collaborative edge inference and multi-site online ad placement. To handle the uncertainty in the coupled stochastic and adversarial rewards, we develop a new bandit algorithm, called EXPUCB, which combines the classical LinUCB and EXP3 algorithms, and we prove that it achieves sublinear regret. We apply EXPUCB to the collaborative edge inference problem and evaluate its performance. Extensive simulation results verify the superior learning ability of EXPUCB under coupled stochastic and adversarial rewards.
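The abstract does not spell out EXPUCB, so the following is not the authors' algorithm. It is only a rough sketch of one naive way the two named ingredients could be combined: EXP3 chooses a group, and LinUCB chooses an arm inside that group. Everything specific here is an assumption made for illustration: a single shared parameter vector `theta`, an attack that zeroes the reward of the attacked group, a learner that sees after the fact whether its group was attacked, the toy adversary that always attacks group 0, and the values of `gamma` and `alpha`.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mat_vec(m, v):
    return [dot(row, v) for row in m]


class GroupBanditSketch:
    """Illustrative only: EXP3 over groups, LinUCB over arms in the chosen group."""

    def __init__(self, n_groups, dim, gamma=0.1, alpha=1.0):
        self.gamma = gamma  # EXP3 exploration rate (assumed value)
        self.alpha = alpha  # LinUCB confidence width (assumed value)
        self.weights = [1.0] * n_groups
        # Inverse of the ridge-regularised design matrix; starts as the identity.
        self.a_inv = [[float(i == j) for j in range(dim)] for i in range(dim)]
        self.b = [0.0] * dim

    def group_probs(self):
        total = sum(self.weights)
        k = len(self.weights)
        return [(1 - self.gamma) * w / total + self.gamma / k for w in self.weights]

    def select(self, groups, rng):
        """groups[g] is a list of feature vectors; returns (group, arm) indices."""
        g = rng.choices(range(len(groups)), weights=self.group_probs())[0]
        theta_hat = mat_vec(self.a_inv, self.b)

        def ucb(x):
            width = math.sqrt(dot(x, mat_vec(self.a_inv, x)))
            return dot(theta_hat, x) + self.alpha * width

        arm = max(range(len(groups[g])), key=lambda i: ucb(groups[g][i]))
        return g, arm

    def update(self, g, x, reward, attacked):
        """reward must lie in [0, 1]; attacked is assumed observable afterwards."""
        probs = self.group_probs()
        k = len(self.weights)
        # EXP3 step: importance-weighted reward estimate for the played group.
        self.weights[g] *= math.exp(self.gamma * (reward / probs[g]) / k)
        top = max(self.weights)
        self.weights = [w / top for w in self.weights]  # rescale to avoid overflow
        if not attacked:
            # LinUCB step: Sherman-Morrison rank-one update of the inverse matrix.
            ax = mat_vec(self.a_inv, x)
            denom = 1.0 + dot(x, ax)
            for i in range(len(x)):
                for j in range(len(x)):
                    self.a_inv[i][j] -= ax[i] * ax[j] / denom
            self.b = [bi + reward * xi for bi, xi in zip(self.b, x)]


def simulate(rounds=2000, seed=0):
    rng = random.Random(seed)
    theta = [0.6, 0.3]  # hypothetical true parameter
    groups = [[[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [0.8, 0.2]]]
    learner = GroupBanditSketch(n_groups=2, dim=2)
    total = 0.0
    for _ in range(rounds):
        g, arm = learner.select(groups, rng)
        x = groups[g][arm]
        attacked = g == 0  # toy adversary: always attacks group 0
        noisy = dot(theta, x) + rng.gauss(0.0, 0.05)
        reward = 0.0 if attacked else min(1.0, max(0.0, noisy))
        learner.update(g, x, reward, attacked)
        total += reward
    return learner, total


if __name__ == "__main__":
    learner, total = simulate()
    print("group probabilities:", learner.group_probs())
    print("cumulative reward:", round(total, 2))
```

In this toy setting the group-level weights should drift away from the group that is always attacked, while the within-group estimate is fitted only on rounds where the linear signal was actually observed. How the paper couples the two updates, and what the learner is allowed to observe, may differ.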