Federated Learning With Differential Privacy Based on Summary Statistics

Peng Zhang, Pingqing Liu
Engineering Reports, vol. 7, no. 10, published 2025-10-20
DOI: 10.1002/eng2.70429 (https://onlinelibrary.wiley.com/doi/10.1002/eng2.70429)
Impact factor 2.0, JCR Q3, Computer Science, Interdisciplinary Applications
Citations: 0

Abstract

In data analysis, privacy preservation is receiving growing attention, and privacy concerns have led to the formation of "data silos". Federated learning can perform integrated analysis of data while protecting privacy, and it is currently an effective way to break the "data silo" dilemma. In this paper, we build a federated learning framework based on differential privacy. First, for each local dataset, summary statistics are computed and transmitted to a trust center: the parameter estimates and the maximum $L_2$ norm of the coefficient vector of the polynomial function used to approximate the individual log-likelihood function. Second, at the trust center, Gaussian noise is added to the coefficients of the polynomial function that approximates the full log-likelihood function, and the private parameter estimate is obtained by optimizing the resulting noisy objective function; the estimator satisfies $(\varepsilon, \delta)$-DP. In addition, theoretical guarantees are provided for both the privacy and the statistical utility of the proposed method. Finally, we verify the utility of the method using numerical simulations and apply it to a study of salary impact factors.
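The mechanism described above, where each site contributes polynomial coefficients whose sum is privatized with Gaussian noise before optimization, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sensitivity bound (the transmitted maximum $L_2$ norm), the quadratic polynomial degree, and the classical Gaussian-mechanism calibration (valid for $\varepsilon \le 1$) are all assumptions made for the sketch.

```python
import numpy as np

def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale for the classical Gaussian mechanism achieving
    (epsilon, delta)-DP for a query with the given L2 sensitivity."""
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon

def aggregate_with_dp(local_coeffs, max_l2_norm, epsilon, delta, rng=None):
    """Sum per-site polynomial coefficient vectors at the trust center
    and add Gaussian noise calibrated to the reported sensitivity.

    local_coeffs : list of 1-D arrays, one coefficient vector per site
    max_l2_norm  : bound on the L2 norm of any single site's vector,
                   i.e., the sensitivity of the sum to one site's data
    """
    rng = np.random.default_rng() if rng is None else rng
    total = np.sum(local_coeffs, axis=0)          # full-likelihood coefficients
    sigma = gaussian_sigma(max_l2_norm, epsilon, delta)
    return total + rng.normal(0.0, sigma, size=total.shape)

# Example: three sites each send coefficients (c0, c1, c2) of a local
# quadratic approximation c0 + c1*theta + c2*theta^2 to -log-likelihood.
sites = [np.array([1.0, -2.0, 1.0]),
         np.array([0.5, -1.2, 0.9]),
         np.array([0.8, -1.6, 1.1])]
noisy = aggregate_with_dp(sites, max_l2_norm=3.0, epsilon=1.0, delta=1e-5)
theta_hat = -noisy[1] / (2.0 * noisy[2])  # minimizer of the noisy quadratic
```

In this sketch only the coefficient sum is released, so by post-processing the minimizer `theta_hat` of the noisy objective inherits the same $(\varepsilon, \delta)$-DP guarantee.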


Source journal: Engineering Reports (open access). CiteScore 5.10; self-citation rate 0.00%; review time 19 weeks.