SignDS-FL: Local Differentially Private Federated Learning with Sign-based Dimension Selection

Xue Jiang, Xuebing Zhou, Jens Grossklags
{"title":"SignDS-FL: Local Differentially Private Federated Learning with Sign-based Dimension Selection","authors":"Xue Jiang, Xuebing Zhou, Jens Grossklags","doi":"10.1145/3517820","DOIUrl":null,"url":null,"abstract":"Federated Learning (FL) [31] is a decentralized learning mechanism that has attracted increasing attention due to its achievements in computational efficiency and privacy preservation. However, recent research highlights that the original FL framework may still reveal sensitive information of clients’ local data from the exchanged local updates and the global model parameters. Local Differential Privacy (LDP), as a rigorous definition of privacy, has been applied to Federated Learning to provide formal privacy guarantees and prevent potential privacy leakage. However, previous LDP-FL solutions suffer from considerable utility loss with an increase of model dimensionality. Recent work [29] proposed a two-stage framework that mitigates the dimension-dependency problem by first selecting one “important” dimension for each local update and then perturbing the dimension value to construct the sparse privatized update. However, the framework may still suffer from utility loss because of the insufficient per-stage privacy budget and slow model convergence. In this article, we propose an improved framework, SignDS-FL, which shares the concept of dimension selection with Reference [29], but saves the privacy cost for the value perturbation stage by assigning random sign values to the selected dimensions. Besides using the single-dimension selection algorithms in Reference [29], we propose an Exponential Mechanism-based Multi-Dimension Selection algorithm that further improves model convergence and accuracy. We evaluate the framework on a number of real-world datasets with both simple logistic regression models and deep neural networks. For training logistic regression models on structured datasets, our framework yields only a \\( \\sim \\) 1%–2% accuracy loss in comparison to a \\( \\sim \\) 5%–15% decrease of accuracy for the baseline methods. For training deep neural networks on image datasets, the accuracy loss of our framework is less than \\( 8\\% \\) and at best only \\( 2\\% \\) . Extensive experimental results show that our framework significantly outperforms the previous LDP-FL solutions and enjoys an advanced utility-privacy balance.","PeriodicalId":123526,"journal":{"name":"ACM Transactions on Intelligent Systems and Technology (TIST)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-03-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Intelligent Systems and Technology (TIST)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3517820","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8

Abstract

Federated Learning (FL) [31] is a decentralized learning mechanism that has attracted increasing attention due to its achievements in computational efficiency and privacy preservation. However, recent research highlights that the original FL framework may still reveal sensitive information about clients' local data through the exchanged local updates and the global model parameters. Local Differential Privacy (LDP), as a rigorous definition of privacy, has been applied to Federated Learning to provide formal privacy guarantees and prevent potential privacy leakage. However, previous LDP-FL solutions suffer considerable utility loss as model dimensionality increases. Recent work [29] proposed a two-stage framework that mitigates the dimension-dependency problem by first selecting one "important" dimension for each local update and then perturbing the dimension value to construct a sparse privatized update. However, that framework may still suffer from utility loss because of the insufficient per-stage privacy budget and slow model convergence. In this article, we propose an improved framework, SignDS-FL, which shares the concept of dimension selection with Reference [29] but saves the privacy cost of the value-perturbation stage by assigning random sign values to the selected dimensions. Besides using the single-dimension selection algorithms of Reference [29], we propose an Exponential Mechanism-based Multi-Dimension Selection algorithm that further improves model convergence and accuracy. We evaluate the framework on a number of real-world datasets with both simple logistic regression models and deep neural networks. For training logistic regression models on structured datasets, our framework yields only a ~1%–2% accuracy loss, compared to a ~5%–15% accuracy decrease for the baseline methods. For training deep neural networks on image datasets, the accuracy loss of our framework is less than 8% and at best only 2%. Extensive experimental results show that our framework significantly outperforms previous LDP-FL solutions and achieves a favorable utility-privacy balance.
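To make the two-stage idea concrete, the sketch below illustrates a client-side step in the spirit of SignDS-FL: one "important" dimension is sampled with the standard exponential mechanism, and only a dimension index and a sign leave the client, so no additional privacy budget is spent on perturbing a real-valued magnitude. The top-k indicator utility, the single-dimension sampling, and the sign-assignment rule are simplifying assumptions made here for illustration; the paper's actual scoring function, multi-dimension selection algorithm, and privacy accounting differ in detail.

import numpy as np

def signds_client_update(local_update, k, epsilon, rng=None):
    """Toy sketch of a sign-based dimension-selection client step.

    Assumptions (not taken verbatim from the paper): the utility of a
    dimension is 1 if it lies among the client's top-k dimensions by
    magnitude and 0 otherwise (sensitivity 1), a single dimension is
    sampled with the standard exponential mechanism, and the reported
    sign is the sign of the true update at that dimension.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = local_update.shape[0]

    # Indicator utility: 1 for the client's top-k dimensions, 0 otherwise.
    top_k = np.argsort(np.abs(local_update))[-k:]
    utility = np.zeros(d)
    utility[top_k] = 1.0

    # Exponential mechanism: Pr[j] proportional to exp(epsilon * u(j) / 2).
    scores = np.exp(epsilon * utility / 2.0)
    probs = scores / scores.sum()
    selected = rng.choice(d, p=probs)

    # Sign-based report: a sparse update carrying only +1/-1 at the
    # selected dimension, instead of a perturbed real value.
    sign = 1.0 if local_update[selected] >= 0 else -1.0
    sparse_update = np.zeros(d)
    sparse_update[selected] = sign
    return sparse_update

# Example: a 10-dimensional local update, top-3 utility, epsilon = 1.
update = np.array([0.2, -1.5, 0.1, 3.0, -0.4, 0.05, 0.9, -2.2, 0.3, 0.0])
print(signds_client_update(update, k=3, epsilon=1.0))

The server would then aggregate these sparse sign updates across clients and apply a learning-rate-scaled step, which is where the framework recovers accuracy despite each client reporting only one signed coordinate.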