Hongtao Li, Xinyu Li, Ximeng Liu, Bo Wang, Jie Wang, Youliang Tian
Title: FedSam: Enhancing federated learning accuracy with differential privacy and data heterogeneity mitigation
DOI: 10.1016/j.csi.2025.104019
Journal: Computer Standards & Interfaces, Volume 95, Article 104019
Publication date: 2025-05-28
Impact factor: 3.1 (JCR Q1, Computer Science, Hardware & Architecture)
URL: https://www.sciencedirect.com/science/article/pii/S0920548925000480
Citations: 0
Abstract
A large-scale model is typically trained on an extensive dataset to update its parameters and enhance its classification capabilities. However, directly using such data can raise significant privacy concerns, especially in the medical field, where datasets often contain sensitive patient information. Federated Learning (FL) offers a solution by enabling multiple parties to collaboratively train a high-performance model without sharing their raw data. Despite this, during the federated training process, attackers can still potentially extract private information from local models. To bolster privacy protections, Differential Privacy (DP) has been introduced to FL, providing stringent safeguards. However, the combination of DP and data heterogeneity can often lead to reduced model accuracy. To tackle these challenges, we introduce a sampling-memory mechanism, FedSam, which improves the accuracy of the global model while maintaining the required noise levels for differential privacy. This mechanism also mitigates the adverse effects of data heterogeneity in heterogeneous federated environments, thereby improving the global model’s overall performance. Experimental evaluations demonstrate the superiority of our approach. FedSam achieves a classification accuracy of 95.03%, significantly outperforming traditional DP-FedAvg (91.74%) under the same privacy constraints, highlighting FedSam’s robustness and efficiency.
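The abstract does not spell out FedSam's algorithmic details, but the DP-FedAvg baseline it is compared against is well known: each client's model update is norm-clipped to bound sensitivity, the clipped updates are averaged, and calibrated Gaussian noise is added before the global model is updated. A minimal sketch of one such round (the function name, clipping style, and noise calibration are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def dp_fedavg_round(global_model, client_updates, clip_norm=1.0,
                    noise_multiplier=1.0, rng=None):
    """One round of DP-FedAvg: clip each client's update to bound its
    L2 sensitivity, average, then add Gaussian noise scaled to clip_norm."""
    rng = rng if rng is not None else np.random.default_rng(0)
    clipped = []
    for delta in client_updates:
        norm = np.linalg.norm(delta)
        # Scale down any update whose L2 norm exceeds clip_norm.
        clipped.append(delta * min(1.0, clip_norm / max(norm, 1e-12)))
    avg = np.mean(clipped, axis=0)
    # Gaussian mechanism: per-coordinate noise std shrinks as more
    # clients participate, since averaging divides the sensitivity.
    sigma = noise_multiplier * clip_norm / len(client_updates)
    noisy_avg = avg + rng.normal(0.0, sigma, size=avg.shape)
    return global_model + noisy_avg

# Toy usage: three clients updating a 4-parameter model.
model = np.zeros(4)
updates = [np.array([0.5, -0.2, 0.1, 0.0]),
           np.array([2.0, 2.0, 2.0, 2.0]),   # large update gets clipped
           np.array([0.1, 0.1, -0.1, 0.1])]
new_model = dp_fedavg_round(model, updates, clip_norm=1.0,
                            noise_multiplier=0.5)
```

Under this scheme the noise needed for a given privacy budget is fixed by the clipping norm and sampling regime, which is why the abstract's accuracy gap (95.03% vs. 91.74% under the same privacy constraints) must come from how updates are selected and aggregated rather than from weaker noise.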
Journal description:
The quality of software, well-defined interfaces (hardware and software), the process of digitalisation, and accepted standards in these fields are essential for building and exploiting complex computing, communication, multimedia and measuring systems. Standards can simplify the design and construction of individual hardware and software components and help to ensure satisfactory interworking.
Computer Standards & Interfaces is an international journal dealing specifically with these topics.
The journal
• Provides information about activities and progress on the definition of computer standards, software quality, interfaces and methods, at national, European and international levels
• Publishes critical comments on standards and standards activities
• Disseminates users' experiences and case studies in the application and exploitation of established or emerging standards, interfaces and methods
• Offers a forum for discussion on actual projects, standards, interfaces and methods by recognised experts
• Stimulates relevant research by providing a specialised refereed medium.