Federated Multiarmed Bandits Under Byzantine Attacks

Artun Saday;İlker Demirel;Yiğit Yıldırım;Cem Tekin
Published in IEEE Transactions on Artificial Intelligence, vol. 6, no. 6, pp. 1488-1501, 2025-01-03. DOI: 10.1109/TAI.2024.3524954. URL: https://ieeexplore.ieee.org/document/10820861/

Abstract

Multiarmed bandits (MAB) is a sequential decision-making model in which the learner controls the trade-off between exploration and exploitation to maximize its cumulative reward. Federated multiarmed bandits (FMAB) is an emerging framework where a cohort of learners with heterogeneous local models play an MAB game and communicate their aggregated feedback to a server to learn a globally optimal arm. Two key hurdles in FMAB are communication-efficient learning and resilience to adversarial attacks. To address these issues, we study the FMAB problem in the presence of Byzantine clients who can send false model updates threatening the learning process. We analyze the sample complexity and the regret of $\beta$-optimal arm identification. We borrow tools from robust statistics and propose a median-of-means (MoM)-based online algorithm, Fed-MoM-UCB, to cope with Byzantine clients. In particular, we show that if the Byzantine clients constitute less than half of the cohort, the cumulative regret with respect to $\beta$-optimal arms is bounded over time with high probability, showcasing both communication efficiency and Byzantine resilience. We analyze the interplay between the algorithm parameters, a discernibility margin, regret, communication cost, and the arms’ suboptimality gaps. We demonstrate Fed-MoM-UCB's effectiveness against the baselines in the presence of Byzantine attacks via experiments.
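The median-of-means idea mentioned above can be illustrated with a minimal sketch. This is a generic MoM estimator applied to one round of client reports, not the paper's Fed-MoM-UCB algorithm; the function name `median_of_means`, the client counts, and the reported values are all assumptions chosen for illustration.

```python
import random
import statistics


def median_of_means(values, num_groups):
    """Split values into num_groups groups, average each group,
    and return the median of the group averages."""
    if not 1 <= num_groups <= len(values):
        raise ValueError("num_groups must be between 1 and len(values)")
    # Strided slicing gives groups whose sizes differ by at most one.
    groups = [values[i::num_groups] for i in range(num_groups)]
    group_means = [sum(g) / len(g) for g in groups]
    return statistics.median(group_means)


# Hypothetical round: 9 honest clients report an arm mean near 0.5,
# while 4 Byzantine clients (fewer than half) report 100.0.
rng = random.Random(0)
honest = [0.5 + rng.uniform(-0.05, 0.05) for _ in range(9)]
byzantine = [100.0] * 4
reports = honest + byzantine
rng.shuffle(reports)

naive_mean = sum(reports) / len(reports)               # pulled far from 0.5
robust = median_of_means(reports, num_groups=9)        # 4 bad reports spoil at most 4 of 9 groups
plain_median = median_of_means(reports, num_groups=len(reports))  # one report per group

print(f"naive mean:      {naive_mean:.3f}")
print(f"median of means: {robust:.3f}")
print(f"plain median:    {plain_median:.3f}")
```

In this sketch the estimate stays within the honest range because the corrupted reports can contaminate fewer than half of the groups, so the median is always taken from an uncontaminated group mean. The plain average has no such protection.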