{"title":"Federated Learning With Differential Privacy Based on Summary Statistics","authors":"Peng Zhang, Pingqing Liu","doi":"10.1002/eng2.70429","DOIUrl":null,"url":null,"abstract":"<p>In data analysis, privacy preservation is receiving increasing attention, and privacy concerns have led to the formation of “data silos”. Federated learning can accomplish integrated data analysis while protecting data privacy and is currently an effective way to break the “data silo” dilemma. In this paper, we build a federated learning framework based on differential privacy. First, for each local dataset, the summary statistics of the parameter estimates and the maximum <span></span><math>\n <semantics>\n <mrow>\n <msub>\n <mrow>\n <mi>L</mi>\n </mrow>\n <mrow>\n <mn>2</mn>\n </mrow>\n </msub>\n </mrow>\n <annotation>$$ {L}_2 $$</annotation>\n </semantics></math> norm of the coefficient vector of the polynomial function used to approximate the individual log-likelihood function are computed and transmitted to the trust center. Second, at the trust center, Gaussian noise is added to the coefficients of the polynomial function that approximates the full log-likelihood function; the private parameter estimates are obtained from the noisy objective function, and the resulting estimator satisfies <span></span><math>\n <semantics>\n <mrow>\n <mo>(</mo>\n <mi>ε</mi>\n <mo>,</mo>\n <mi>δ</mi>\n <mo>)</mo>\n </mrow>\n <annotation>$$ \\left(\\varepsilon, \\delta \\right) $$</annotation>\n </semantics></math>-DP. In addition, theoretical guarantees are provided for both the privacy and the statistical utility of the proposed method. Finally, we verify the utility of the method through numerical simulations and apply it to a study of salary impact factors.</p>","PeriodicalId":72922,"journal":{"name":"Engineering reports : open access","volume":"7 10","pages":""},"PeriodicalIF":2.0000,"publicationDate":"2025-10-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://onlinelibrary.wiley.com/doi/epdf/10.1002/eng2.70429","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Engineering reports : open access","FirstCategoryId":"1085","ListUrlMain":"https://onlinelibrary.wiley.com/doi/10.1002/eng2.70429","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS","Score":null,"Total":0}
Citations: 0
Abstract
In data analysis, privacy preservation is receiving increasing attention, and privacy concerns have led to the formation of “data silos”. Federated learning can accomplish integrated data analysis while protecting data privacy and is currently an effective way to break the “data silo” dilemma. In this paper, we build a federated learning framework based on differential privacy. First, for each local dataset, the summary statistics of the parameter estimates and the maximum L2 norm of the coefficient vector of the polynomial function used to approximate the individual log-likelihood function are computed and transmitted to the trust center. Second, at the trust center, Gaussian noise is added to the coefficients of the polynomial function that approximates the full log-likelihood function; the private parameter estimates are obtained from the noisy objective function, and the resulting estimator satisfies (ε, δ)-DP. In addition, theoretical guarantees are provided for both the privacy and the statistical utility of the proposed method. Finally, we verify the utility of the method through numerical simulations and apply it to a study of salary impact factors.
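The perturbation step the abstract describes — adding Gaussian noise, calibrated by a maximum L2 norm, to aggregated polynomial coefficients so the result satisfies (ε, δ)-DP — can be illustrated with the classical Gaussian mechanism. This is a minimal sketch, not the paper's exact procedure: the `gaussian_mechanism` helper, the sensitivity proxy, the σ calibration σ = Δ₂·√(2 ln(1.25/δ))/ε, and the example coefficient vectors are all illustrative assumptions.

```python
import numpy as np

def gaussian_mechanism(coef_sum, l2_sensitivity, epsilon, delta, rng=None):
    """Perturb a vector with Gaussian noise calibrated for (epsilon, delta)-DP.

    Uses the classical calibration sigma = Delta_2 * sqrt(2 ln(1.25/delta)) / epsilon,
    valid for epsilon in (0, 1); the paper's own noise scale may differ.
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma = l2_sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return coef_sum + rng.normal(0.0, sigma, size=np.shape(coef_sum))

# Illustrative trust-center aggregation: each site k sends the coefficient
# vector c_k of its local polynomial approximation together with its L2 norm;
# the center sums the coefficients and releases only the perturbed sum.
site_coefs = [np.array([0.5, -1.2, 0.3]), np.array([0.4, -1.0, 0.2])]  # hypothetical values
max_l2 = max(np.linalg.norm(c) for c in site_coefs)  # crude proxy for the L2 sensitivity
noisy_coefs = gaussian_mechanism(sum(site_coefs), 2.0 * max_l2, epsilon=1.0, delta=1e-5)
```

Only `noisy_coefs` would leave the trust center; downstream, the private parameter estimates would be obtained by optimizing the polynomial objective built from these perturbed coefficients rather than the exact ones.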