Robust Adversarial Defenses in Federated Learning: Exploring the Impact of Data Heterogeneity

Impact Factor 8.0 · Zone 1, Computer Science · JCR Q1, COMPUTER SCIENCE, THEORY & METHODS

IEEE Transactions on Information Forensics and Security, vol. 20, pp. 6005-6018. Pub Date: 2025-06-04. DOI: 10.1109/TIFS.2025.3576594. URL: https://ieeexplore.ieee.org/document/11024045/

Qian Li;Di Wu;Dawei Zhou;Chenhao Lin;Shuai Liu;Cong Wang;Chao Shen

Citations: 0

Abstract

Robust Adversarial Defenses in Federated Learning: Exploring the Impact of Data Heterogeneity

Federated Learning (FL) enables geographically distributed clients to collaboratively train machine learning models by exchanging local model parameters while preserving data privacy. In practice, FL faces two critical challenges. First, it is vulnerable to security threats, because malicious clients can deliberately harm the functionality of FL by launching poisoning attacks. Second, the inherent data heterogeneity among clients (termed Non-IID data in FL) arises naturally from distributed data ownership and significantly degrades model convergence and accuracy. Because these two problems have largely been studied separately, the interplay between data heterogeneity and security remains poorly understood. In this paper, we systematically investigate the relationship between data heterogeneity and adversarial robustness in FL. Specifically, we propose novel data partitioning algorithms that simulate Label-Conditional Non-IID and Feature-Conditional Non-IID with quantifiable heterogeneity levels. Further, we conduct extensive experiments to evaluate classical defense methods in a practical FL environment under state-of-the-art untargeted attacks. Using results from various settings, we analyze separately how Non-IID data relates to attacks and to defenses. Regarding attacks, Non-IID data and attacks have similar effects on models, but Non-IID data affects training in a different way. The interaction between attacks and Non-IID data creates an opportunity to cause severe damage to FL. Regarding defenses, Non-IID data induces heterogeneity in the distribution of client models, which makes it harder for defense methods to maintain fidelity and robustness.
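
The abstract does not spell out the proposed partitioning algorithms, so the following is only an illustrative sketch of the general idea of label-skewed Non-IID data with a tunable heterogeneity level. It uses the widely used Dirichlet label-skew split, which is a stand-in and not the authors' method. The function names `dirichlet_label_partition` and `mean_label_tv_distance`, the parameter `alpha`, and the choice of total-variation distance as a heterogeneity score are all assumptions made for this example.

```python
import random
from collections import defaultdict


def dirichlet_label_partition(labels, num_clients, alpha, seed=0):
    """Split sample indices across clients with Dirichlet label skew.

    Hypothetical helper, not the paper's algorithm. Smaller alpha
    concentrates each class on fewer clients (more Non-IID); larger
    alpha pushes every client toward the global label mix.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, y in enumerate(labels):
        by_class[y].append(idx)

    clients = [[] for _ in range(num_clients)]
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        # A Dirichlet(alpha) sample is a vector of normalized Gamma draws.
        gammas = [rng.gammavariate(alpha, 1.0) for _ in range(num_clients)]
        # Guard against an all-zero draw (numerical underflow at tiny alpha);
        # in that case the whole class falls to the last client.
        total = sum(gammas) or 1.0
        start, cum = 0, 0.0
        for c, g in enumerate(gammas):
            cum += g / total
            # Cumulative shares become cut points in this class's index list.
            end = len(idxs) if c == num_clients - 1 else round(cum * len(idxs))
            clients[c].extend(idxs[start:end])
            start = end
    return clients


def mean_label_tv_distance(labels, clients):
    """Average total-variation distance between client and global label mixes.

    An assumed heterogeneity score: 0 means every client matches the
    global label distribution; values near 1 mean strong label skew.
    """
    classes = sorted(set(labels))
    n = len(labels)
    global_p = [sum(1 for y in labels if y == k) / n for k in classes]
    dists = []
    for idxs in clients:
        if not idxs:
            continue  # clients that received no samples are skipped
        p = [sum(1 for i in idxs if labels[i] == k) / len(idxs) for k in classes]
        dists.append(0.5 * sum(abs(a - b) for a, b in zip(p, global_p)))
    return sum(dists) / len(dists)


# Toy data: 5000 samples, 10 balanced classes, 10 clients.
labels = [i % 10 for i in range(5000)]
skewed = dirichlet_label_partition(labels, num_clients=10, alpha=0.1)
mild = dirichlet_label_partition(labels, num_clients=10, alpha=100.0)
print("heterogeneity at alpha=0.1:", mean_label_tv_distance(labels, skewed))
print("heterogeneity at alpha=100:", mean_label_tv_distance(labels, mild))
```

In this sketch the single knob `alpha` sets the heterogeneity level, and the distance score gives one way to report it as a number. A feature-conditional split would need a different construction (for example, grouping samples by a feature-space clustering) and is not covered here.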

Source journal

IEEE Transactions on Information Forensics and Security (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 14.40
Self-citation rate: 7.40%
Articles published: 234
Review time: 6.5 months

About the journal: The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance and systems applications that incorporate these features.