XMAM: X-raying models with a matrix to reveal backdoor attacks for federated learning

Impact Factor 7.5 | Zone 2, Computer Science | JCR Q1, Telecommunications

Digital Communications and Networks Pub Date : 2024-08-01 DOI:10.1016/j.dcan.2023.01.017

Jianyi Zhang, Fangjiao Zhang, Qichao Jin, Zhiqiang Wang, Xiaodong Lin, Xiali Hei

Volume 10, Issue 4, Pages 1154-1167. Full text: https://www.sciencedirect.com/science/article/pii/S2352864823000305

Citations: 0

Abstract

Federated Learning (FL), a burgeoning technology, has received increasing attention because of its privacy-preserving capability. However, the base algorithm, FedAvg, is vulnerable to so-called backdoor attacks. Previous researchers have proposed several robust aggregation methods. Unfortunately, because backdoor attacks are stealthy by nature, many of these aggregation methods cannot defend against them. Moreover, attackers have recently proposed hiding methods that further improve the stealthiness of backdoor attacks, causing all existing robust aggregation methods to fail.

To tackle the threat of backdoor attacks, we propose a new aggregation method, X-raying Models with A Matrix (XMAM), to reveal the malicious local model updates submitted by backdoor attackers. We observe that the output of the Softmax layer exhibits distinguishable patterns between malicious and benign updates. Therefore, unlike existing aggregation algorithms, we focus on the Softmax layer's output, where backdoor attackers find it difficult to hide their malicious behavior. Specifically, much like a medical X-ray examination, we examine the collected local model updates by feeding a matrix as input and obtaining the corresponding Softmax layer outputs. We then use clustering to exclude the updates whose outputs are abnormal. Extensive evaluations show that, without any training dataset on the server, XMAM can effectively distinguish malicious local model updates from benign ones. For instance, whereas other methods fail to defend against backdoor attacks when no more than 20% of clients are malicious, our method can tolerate 45% malicious clients in the black-box mode and about 30% in Projected Gradient Descent (PGD) mode. Moreover, under adaptive attacks, the results demonstrate that XMAM can still complete the global model training task even when 40% of clients are malicious. Finally, we analyze our method's screening complexity and compare its actual screening time with that of other methods. The results show that XMAM is about 10 to 10,000 times faster than the existing methods.
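
The screening idea described in the abstract (probe every local model with the same matrix, collect the Softmax outputs, cluster them, and drop the abnormal group) can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the probe matrix, the model architecture, or the clustering algorithm, so the all-ones probe, the toy linear classifiers, the plain 2-means clustering, and the names `probe_outputs` and `screen_by_clustering` are all assumptions chosen for illustration.

```python
import numpy as np


def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def probe_outputs(models, probe):
    """Feed the same probe input to every local model (hypothetical helper).

    Each model is a (W, b) pair of a toy linear classifier; a real system
    would run a forward pass of the full network instead.
    """
    return np.stack([softmax(probe @ W + b) for W, b in models])


def screen_by_clustering(outputs, n_iter=20):
    """Split the Softmax outputs into two clusters and keep the larger one.

    Plain 2-means is an assumption; the abstract only says "clustering".
    """
    median = np.median(outputs, axis=0)
    dist = np.linalg.norm(outputs - median, axis=1)
    # Start from the coordinate-wise median and the point farthest from it.
    centroids = np.stack([median, outputs[dist.argmax()]])
    for _ in range(n_iter):
        d = np.linalg.norm(outputs[:, None, :] - centroids[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centroids[k] = outputs[labels == k].mean(axis=0)
    majority = np.bincount(labels, minlength=2).argmax()
    return np.flatnonzero(labels == majority)


# Toy setup (all values are illustrative): 7 benign and 3 backdoored clients.
rng = np.random.default_rng(0)
n_features, n_classes = 16, 5
base_W = rng.normal(scale=0.1, size=(n_features, n_classes))

models = []
for client in range(10):
    W = base_W + rng.normal(scale=0.01, size=base_W.shape)
    if client >= 7:
        W[:, 0] += 0.5  # crude stand-in for a backdoor pulling toward class 0
    models.append((W, np.zeros(n_classes)))

# A 4x4 all-ones matrix, flattened to match the toy linear models.
probe = np.ones((4, 4)).reshape(-1)

outputs = probe_outputs(models, probe)
kept = screen_by_clustering(outputs)

# Aggregate only the surviving models (simple unweighted average).
global_W = np.mean([models[i][0] for i in kept], axis=0)
print("kept clients:", kept.tolist())
```

In this toy setup the three skewed models produce Softmax vectors far from the benign group, so they should fall into the minority cluster and be left out of the average. Note that the server needs no training data for this step, only the probe input, which matches the abstract's claim; how well the separation holds for real networks and real attacks is what the paper's evaluation addresses.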


Source journal

Digital Communications and Networks (Computer Science: Hardware and Architecture)

CiteScore: 12.80
Self-citation rate: 5.10%
Articles published: 915
Review time: 30 weeks

About the journal: Digital Communications and Networks is a prestigious journal that focuses on communication systems and networks. We publish only top-notch original articles and authoritative reviews, all of which undergo rigorous peer review. All our articles are fully Open Access and can be accessed on ScienceDirect. The journal is indexed by eminent databases such as the Science Citation Index Expanded (SCIE) and Scopus. In addition to regular articles, we may also consider exceptional conference papers that have been significantly expanded. We also periodically release special issues that focus on specific aspects of the field.