Federated Multiarmed Bandits Under Byzantine Attacks

IEEE Transactions on Artificial Intelligence. Pub Date: 2025-01-03. DOI: 10.1109/TAI.2024.3524954

Artun Saday;İlker Demirel;Yiğit Yıldırım;Cem Tekin

Vol. 6, No. 6, pp. 1488-1501. https://ieeexplore.ieee.org/document/10820861/

Abstract

Multiarmed bandits (MAB) is a sequential decision-making model in which the learner controls the trade-off between exploration and exploitation to maximize its cumulative reward. Federated multiarmed bandits (FMAB) is an emerging framework where a cohort of learners with heterogeneous local models play an MAB game and communicate their aggregated feedback to a server to learn a globally optimal arm. Two key hurdles in FMAB are communication-efficient learning and resilience to adversarial attacks. To address these issues, we study the FMAB problem in the presence of Byzantine clients who can send false model updates threatening the learning process. We analyze the sample complexity and the regret of $\beta$-optimal arm identification. We borrow tools from robust statistics and propose a median-of-means (MoM)-based online algorithm, Fed-MoM-UCB, to cope with Byzantine clients. In particular, we show that if the Byzantine clients constitute less than half of the cohort, the cumulative regret with respect to $\beta$-optimal arms is bounded over time with high probability, showcasing both communication efficiency and Byzantine resilience. We analyze the interplay between the algorithm parameters, a discernibility margin, regret, communication cost, and the arms' suboptimality gaps. We demonstrate Fed-MoM-UCB's effectiveness against the baselines in the presence of Byzantine attacks via experiments.
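To illustrate the robust-statistics idea the abstract refers to, the following is a minimal sketch of a generic median-of-means estimator applied to per-client reward reports for a single arm. It is not the Fed-MoM-UCB algorithm: the abstract does not say how clients are grouped or how confidence bounds are built, so the function names (`median_of_means`, `simulate_reports`), the round-robin grouping, the choice of `2 * num_byzantine + 1` groups, and all numeric values are assumptions made for illustration only.

```python
import random
import statistics


def median_of_means(values, num_groups):
    """Generic median-of-means: split values into groups, average each
    group, and return the median of the group averages."""
    if not 1 <= num_groups <= len(values):
        raise ValueError("num_groups must be between 1 and len(values)")
    # Deal the values round-robin into num_groups non-empty groups.
    groups = [values[i::num_groups] for i in range(num_groups)]
    group_means = [sum(g) / len(g) for g in groups]
    return statistics.median(group_means)


def simulate_reports(num_clients=21, num_byzantine=6, true_mean=0.5, seed=0):
    """Hypothetical setup: honest clients report noisy estimates of one
    arm's mean reward; Byzantine clients report an arbitrary large value."""
    rng = random.Random(seed)
    honest = [rng.gauss(true_mean, 0.1)
              for _ in range(num_clients - num_byzantine)]
    byzantine = [100.0] * num_byzantine
    reports = honest + byzantine
    rng.shuffle(reports)  # the server does not know which clients are honest
    return reports


num_byzantine = 6
reports = simulate_reports(num_byzantine=num_byzantine)

# Plain averaging is dragged far from the true mean by the false reports.
naive_estimate = sum(reports) / len(reports)

# With 2 * num_byzantine + 1 groups, at most num_byzantine groups can contain
# a false report, so a strict majority of group means come from honest clients.
robust_estimate = median_of_means(reports, num_groups=2 * num_byzantine + 1)

print(f"naive mean:      {naive_estimate:.3f}")
print(f"median of means: {robust_estimate:.3f}")
```

In this sketch the corrupted groups form a minority, so the median lands on a group made up only of honest reports. This mirrors, in a simplified setting, why the abstract's guarantee requires Byzantine clients to be fewer than half of the cohort.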
