Federated Learning With Differential Privacy Based on Summary Statistics

Peng Zhang, Pingqing Liu
Engineering Reports, vol. 7, no. 10, published 2025-10-20
DOI: 10.1002/eng2.70429 (https://onlinelibrary.wiley.com/doi/10.1002/eng2.70429)
Impact factor 2.0, JCR Q3, Computer Science, Interdisciplinary Applications
Citations: 0

Abstract

In data analysis, privacy preservation is receiving growing attention, and privacy concerns have led to the formation of "data silos". Federated learning can perform integrated analysis of data while protecting privacy, and it is currently an effective way to break the "data silo" dilemma. In this paper, we build a federated learning framework based on differential privacy. First, for each local dataset, summary statistics are computed and transmitted to a trust center: the parameter estimates and the maximum $L_2$ norm of the coefficient vector of the polynomial function used to approximate the individual log-likelihood function. Second, at the trust center, Gaussian noise is added to the coefficients of the polynomial function that approximates the full log-likelihood function, and the private parameter estimate is obtained by optimizing the resulting noisy objective function; the estimator satisfies $(\varepsilon, \delta)$-DP. In addition, theoretical guarantees are provided for both the privacy and the statistical utility of the proposed method. Finally, we verify the utility of the method using numerical simulations and apply it to a study of salary impact factors.
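The mechanism described above, where each site contributes polynomial coefficients whose sum is privatized with Gaussian noise before optimization, can be sketched as follows. This is a minimal illustration, not the authors' implementation: the sensitivity bound (the transmitted maximum $L_2$ norm), the quadratic polynomial degree, and the classical Gaussian-mechanism calibration (valid for $\varepsilon \le 1$) are all assumptions made for the sketch.

```python
import numpy as np

def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale for the classical Gaussian mechanism achieving
    (epsilon, delta)-DP for a query with the given L2 sensitivity."""
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon

def aggregate_with_dp(local_coeffs, max_l2_norm, epsilon, delta, rng=None):
    """Sum per-site polynomial coefficient vectors at the trust center
    and add Gaussian noise calibrated to the reported sensitivity.

    local_coeffs : list of 1-D arrays, one coefficient vector per site
    max_l2_norm  : bound on the L2 norm of any single site's vector,
                   i.e., the sensitivity of the sum to one site's data
    """
    rng = np.random.default_rng() if rng is None else rng
    total = np.sum(local_coeffs, axis=0)          # full-likelihood coefficients
    sigma = gaussian_sigma(max_l2_norm, epsilon, delta)
    return total + rng.normal(0.0, sigma, size=total.shape)

# Example: three sites each send coefficients (c0, c1, c2) of a local
# quadratic approximation c0 + c1*theta + c2*theta^2 to -log-likelihood.
sites = [np.array([1.0, -2.0, 1.0]),
         np.array([0.5, -1.2, 0.9]),
         np.array([0.8, -1.6, 1.1])]
noisy = aggregate_with_dp(sites, max_l2_norm=3.0, epsilon=1.0, delta=1e-5)
theta_hat = -noisy[1] / (2.0 * noisy[2])  # minimizer of the noisy quadratic
```

In this sketch only the coefficient sum is released, so by post-processing the minimizer `theta_hat` of the noisy objective inherits the same $(\varepsilon, \delta)$-DP guarantee.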


Source journal: Engineering Reports (open access). CiteScore 5.10; self-citation rate 0.00%; review time 19 weeks.