How Much Privacy Does Federated Learning with Secure Aggregation Guarantee?

Ahmed Roushdy Elkordy, Jiang Zhang, Yahya H. Ezzeldin, Konstantinos Psounis, Salman Avestimehr
{"title":"安全聚合的联邦学习保证了多少隐私?","authors":"Ahmed Roushdy Elkordy, Jiang Zhang, Yahya H. Ezzeldin, Konstantinos Psounis, Salman Avestimehr","doi":"10.56553/popets-2023-0030","DOIUrl":null,"url":null,"abstract":"Federated learning (FL) has attracted growing interest for enabling privacy-preserving machine learning on data stored at multiple users while avoiding moving the data off-device. However, while data never leaves users’ devices, privacy still cannot be guaranteed since significant computations on users’ training data are shared in the form of trained local models. These local models have recently been shown to pose a substantial privacy threat through different privacy attacks such as model inversion attacks. As a remedy, Secure Aggregation (SA) has been developed as a framework to preserve privacy in FL, by guaranteeing the server can only learn the global aggregated model update but not the individual model updates.While SA ensures no additional information is leaked about the individual model update beyond the aggregated model update, there are no formal guarantees on how much privacy FL with SA can actually offer; as information about the individual dataset can still potentially leak through the aggregated model computed at the server. In this work, we perform a first analysis of the formal privacy guarantees for FL with SA. Specifically, we use Mutual Information (MI) as a quantification metric and derive upper bounds on how much information about each user's dataset can leak through the aggregated model update. When using the FedSGD aggregation algorithm, our theoretical bounds show that the amount of privacy leakage reduces linearly with the number of users participating in FL with SA. To validate our theoretical bounds, we use an MI Neural Estimator to empirically evaluate the privacy leakage under different FL setups on both the MNIST and CIFAR10 datasets. Our experiments verify our theoretical bounds for FedSGD, which show a reduction in privacy leakage as the number of users and local batch size grow, and an increase in privacy leakage as the number of training rounds increases. We also observe similar dependencies for the FedAvg and FedProx protocol.","PeriodicalId":74556,"journal":{"name":"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium","volume":"163 ","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":"{\"title\":\"How Much Privacy Does Federated Learning with Secure Aggregation Guarantee?\",\"authors\":\"Ahmed Roushdy Elkordy, Jiang Zhang, Yahya H. Ezzeldin, Konstantinos Psounis, Salman Avestimehr\",\"doi\":\"10.56553/popets-2023-0030\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Federated learning (FL) has attracted growing interest for enabling privacy-preserving machine learning on data stored at multiple users while avoiding moving the data off-device. However, while data never leaves users’ devices, privacy still cannot be guaranteed since significant computations on users’ training data are shared in the form of trained local models. These local models have recently been shown to pose a substantial privacy threat through different privacy attacks such as model inversion attacks. 
As a remedy, Secure Aggregation (SA) has been developed as a framework to preserve privacy in FL, by guaranteeing the server can only learn the global aggregated model update but not the individual model updates.While SA ensures no additional information is leaked about the individual model update beyond the aggregated model update, there are no formal guarantees on how much privacy FL with SA can actually offer; as information about the individual dataset can still potentially leak through the aggregated model computed at the server. In this work, we perform a first analysis of the formal privacy guarantees for FL with SA. Specifically, we use Mutual Information (MI) as a quantification metric and derive upper bounds on how much information about each user's dataset can leak through the aggregated model update. When using the FedSGD aggregation algorithm, our theoretical bounds show that the amount of privacy leakage reduces linearly with the number of users participating in FL with SA. To validate our theoretical bounds, we use an MI Neural Estimator to empirically evaluate the privacy leakage under different FL setups on both the MNIST and CIFAR10 datasets. Our experiments verify our theoretical bounds for FedSGD, which show a reduction in privacy leakage as the number of users and local batch size grow, and an increase in privacy leakage as the number of training rounds increases. We also observe similar dependencies for the FedAvg and FedProx protocol.\",\"PeriodicalId\":74556,\"journal\":{\"name\":\"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium\",\"volume\":\"163 \",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"3\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.56553/popets-2023-0030\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings on Privacy Enhancing Technologies. Privacy Enhancing Technologies Symposium","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.56553/popets-2023-0030","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 3

Abstract

Federated learning (FL) has attracted growing interest for enabling privacy-preserving machine learning on data stored on multiple users' devices while avoiding moving the data off-device. However, although the data never leaves users' devices, privacy still cannot be guaranteed, since significant computations on users' training data are shared in the form of trained local models. These local models have recently been shown to pose a substantial privacy threat through privacy attacks such as model inversion. As a remedy, Secure Aggregation (SA) has been developed as a framework for preserving privacy in FL by guaranteeing that the server can learn only the global aggregated model update, not the individual model updates. While SA ensures that no additional information about an individual model update is leaked beyond the aggregated update, there are no formal guarantees on how much privacy FL with SA can actually offer, since information about an individual dataset can still potentially leak through the aggregated model computed at the server. In this work, we perform a first analysis of the formal privacy guarantees for FL with SA. Specifically, we use mutual information (MI) as a quantification metric and derive upper bounds on how much information about each user's dataset can leak through the aggregated model update. When using the FedSGD aggregation algorithm, our theoretical bounds show that the amount of privacy leakage reduces linearly with the number of users participating in FL with SA. To validate our theoretical bounds, we use an MI neural estimator to empirically evaluate the privacy leakage under different FL setups on both the MNIST and CIFAR10 datasets. Our experiments verify our theoretical bounds for FedSGD, showing a reduction in privacy leakage as the number of users and the local batch size grow, and an increase in privacy leakage as the number of training rounds increases. We also observe similar dependencies for the FedAvg and FedProx protocols.
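
The secure-aggregation guarantee described in the abstract (the server can recover only the sum of the model updates, never an individual update) can be illustrated with a minimal pairwise-masking sketch in Python. This is not the paper's protocol or code; the modulus, mask generation, and function names below are illustrative assumptions.

```python
import random

MODULUS = 2**32  # assumed modulus for integer masking (illustrative, not the paper's choice)

def pairwise_masks(user_ids, dim, seed=0):
    """Build cancelling pairwise masks: for each pair (a, b), a adds +m and b adds -m."""
    rng = random.Random(seed)
    masks = {u: [0] * dim for u in user_ids}
    for i, a in enumerate(user_ids):
        for b in user_ids[i + 1:]:
            m = [rng.randrange(MODULUS) for _ in range(dim)]
            masks[a] = [(x + y) % MODULUS for x, y in zip(masks[a], m)]
            masks[b] = [(x - y) % MODULUS for x, y in zip(masks[b], m)]
    return masks

def mask_update(update, mask):
    """What a user sends: its integer-quantized update plus its accumulated mask."""
    return [(u + m) % MODULUS for u, m in zip(update, mask)]

def server_aggregate(masked_updates):
    """The server sums the masked updates; the pairwise masks cancel, leaving only the sum."""
    dim = len(next(iter(masked_updates.values())))
    total = [0] * dim
    for vec in masked_updates.values():
        total = [(t + v) % MODULUS for t, v in zip(total, vec)]
    return total

# Toy example: 3 users, 4-dimensional integer-quantized model updates.
users = ["u1", "u2", "u3"]
updates = {"u1": [1, 2, 3, 4], "u2": [5, 6, 7, 8], "u3": [9, 10, 11, 12]}
masks = pairwise_masks(users, dim=4)
sent = {u: mask_update(updates[u], masks[u]) for u in users}
print(server_aggregate(sent))  # -> [15, 18, 21, 24], with no single update revealed
```

In this toy scheme each transmitted vector is hidden by uniformly random masks, so it reveals nothing on its own, yet the masks cancel in the sum, which is exactly the SA property the paper starts from; production SA protocols additionally handle dropouts and collusion, which this sketch omits.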
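
The quantity the paper bounds can be written down directly from the abstract's description: under FedSGD with SA the server observes only the aggregate gradient, and leakage about a user's dataset is measured as the mutual information between that dataset and the observed aggregates. The notation below (N, T, D_k, g_k, eta) is ours, a sketch of the setup rather than the paper's exact statement or bound.

```latex
% FedSGD with secure aggregation: the server only ever sees the aggregate update.
\[
  x^{(t)} \;=\; \sum_{k=1}^{N} g_k^{(t)},
  \qquad
  g_k^{(t)} \;=\; \nabla_{\theta}\, \mathcal{L}\!\left(\theta^{(t)}; \mathcal{B}_k^{(t)}\right),
  \qquad
  \theta^{(t+1)} \;=\; \theta^{(t)} - \frac{\eta}{N}\, x^{(t)},
\]
% where B_k^(t) is a batch drawn from user k's local dataset D_k.
% Privacy leakage about D_k through T training rounds, measured with
% mutual information (the paper's metric):
\[
  \mathrm{Leak}_k \;=\; I\!\left(\mathcal{D}_k \,;\, x^{(1)}, x^{(2)}, \ldots, x^{(T)}\right).
\]
% Per the abstract, the derived upper bound on this quantity for FedSGD
% shrinks as the number of users N (and the local batch size) grows, and
% grows with the number of training rounds T.
```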
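
The empirical evaluation relies on an MI neural estimator. A generic MINE-style estimator based on the Donsker-Varadhan lower bound can be sketched as follows; this is an assumption about the estimator family, not the authors' evaluation code, and the network width, optimizer, and toy data are illustrative.

```python
import math

import torch
import torch.nn as nn

class MINE(nn.Module):
    """Statistics network T(x, z) for the Donsker-Varadhan MI lower bound."""
    def __init__(self, x_dim, z_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(x_dim + z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, x, z):
        return self.net(torch.cat([x, z], dim=1))

def mi_lower_bound(model, x, z):
    """I(X;Z) >= E_joint[T(x,z)] - log E_marginals[exp(T(x,z'))]."""
    joint = model(x, z).mean()
    z_shuffled = z[torch.randperm(z.size(0))]  # break the pairing -> product of marginals
    marginal = torch.logsumexp(model(x, z_shuffled).squeeze(), dim=0) - math.log(x.size(0))
    return joint - marginal

# Toy usage: correlated Gaussians stand in for (user data, aggregated update).
torch.manual_seed(0)
n, d = 2048, 8
x = torch.randn(n, d)
z = x + 0.5 * torch.randn(n, d)  # z is a noisy function of x

est = MINE(x_dim=d, z_dim=d)
opt = torch.optim.Adam(est.parameters(), lr=1e-3)
for _ in range(500):
    opt.zero_grad()
    loss = -mi_lower_bound(est, x, z)  # maximize the lower bound
    loss.backward()
    opt.step()

print(f"estimated MI lower bound: {-loss.item():.3f} nats")
```

In the paper's setting, x and z would presumably stand for (features of) a user's dataset and the aggregated model update observed under different numbers of users, batch sizes, and training rounds.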