Privacy-Preserving Frank-Wolfe on Shuffle Model
Ling-jie Zhang, Shi-song Wu, Hai Zhang
Acta Mathematicae Applicatae Sinica, English Series, vol. 40, no. 4, pp. 887-907, 2024-06-01 (Journal Article)
DOI: 10.1007/s10255-024-1095-6
https://link.springer.com/article/10.1007/s10255-024-1095-6
Impact factor: 0.9; JCR: Q3 (Mathematics, Applied)
Citations: 0
Abstract
In this paper, we design differentially private variants of the classical Frank-Wolfe algorithm in the shuffle model for optimization in machine learning. Under weak assumptions and the generalized linear loss (GLL) structure, we propose a noisy Frank-Wolfe algorithm in the shuffle model (NoisyFWS) and a noisy variance-reduced Frank-Wolfe algorithm in the shuffle model (NoisyVRFWS). Both add calibrated Laplace noise under the shuffling scheme in the \(\ell_p\) case with \(p \in [1, 2]\), and we study their privacy and utility guarantees for Hölder-smooth GLL. The privacy guarantees are obtained mainly through advanced composition and privacy amplification by shuffling. We analyze the utility of NoisyFWS and NoisyVRFWS and obtain the optimal excess population risks \(\mathcal{O}\left(n^{-\frac{1+\alpha}{4\alpha}} + \frac{\log(d)\sqrt{\log(1/\delta)}}{n\epsilon}\right)\) and \(\mathcal{O}\left(n^{-\frac{1+\alpha}{4\alpha}} + \frac{\log(d)\sqrt{\log(1/\delta)}}{n^{2}\epsilon}\right)\), respectively, with gradient complexity \(\mathcal{O}\left(n^{\frac{(1+\alpha)^{2}}{4\alpha^{2}}}\right)\) for \(\alpha \in [1/\sqrt{3},\, 1]\). The risk rates under the shuffling scheme are nearly dimension-independent, which is consistent with previous work in some cases. In addition, there is an important tradeoff between the (α, L)-Hölder smoothness of the GLL and the gradient complexity: the linear gradient complexity \(\mathcal{O}(n)\) is attained at α = 1.
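As a rough illustration of the kind of update the abstract describes, the following Python sketch runs a Frank-Wolfe iteration over an ℓ1 ball for a squared-loss GLL and perturbs each vertex score with Laplace noise before selecting the minimizer. It is not the authors' NoisyFWS or NoisyVRFWS: there is no shuffler and no variance reduction, and `noise_scale` is a free parameter rather than a calibrated value, so no privacy guarantee is implied. The function names, the squared loss, the step size 2/(t+2), and the ℓ1 constraint set are all assumptions made for this example.

```python
import random


def laplace(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def gradient(w, data):
    """Average gradient of the squared loss 0.5 * (<w, x> - y)^2."""
    g = [0.0] * len(w)
    for x, y in data:
        residual = sum(wi * xi for wi, xi in zip(w, x)) - y
        for j in range(len(w)):
            g[j] += residual * x[j] / len(data)
    return g


def noisy_frank_wolfe(data, dim, radius, steps, noise_scale, seed=0):
    """Frank-Wolfe over the l1 ball with Laplace-perturbed vertex scores.

    Hypothetical sketch: noise_scale is NOT calibrated to any privacy
    budget, and no shuffling step is modelled.
    """
    rng = random.Random(seed)
    w = [0.0] * dim
    for t in range(steps):
        g = gradient(w, data)
        best_score, best_j, best_sign = None, 0, 1.0
        # The vertices of the l1 ball are +/- radius * e_j.
        for j in range(dim):
            for sign in (1.0, -1.0):
                score = sign * radius * g[j]  # <vertex, gradient>
                if noise_scale > 0:
                    score += laplace(noise_scale, rng)
                if best_score is None or score < best_score:
                    best_score, best_j, best_sign = score, j, sign
        eta = 2.0 / (t + 2.0)  # classical Frank-Wolfe step size
        # Convex combination of the iterate and the chosen vertex.
        w = [(1.0 - eta) * wi for wi in w]
        w[best_j] += eta * best_sign * radius
    return w


# Toy data: two points in R^2 as (feature vector, label) pairs.
toy_data = [([1.0, 0.0], 2.0), ([0.0, 1.0], 0.0)]
w_plain = noisy_frank_wolfe(toy_data, dim=2, radius=1.0, steps=20, noise_scale=0.0)
w_noisy = noisy_frank_wolfe(toy_data, dim=2, radius=1.0, steps=50, noise_scale=0.5)
```

Because every iterate is a convex combination of vertices of the ball, the output stays feasible regardless of the noise; the noise only affects which vertex is chosen at each step.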
About the journal:
Acta Mathematicae Applicatae Sinica (English Series) is a quarterly journal established by the Chinese Mathematical Society. The journal publishes high quality research papers from all branches of applied mathematics, and particularly welcomes those from partial differential equations, computational mathematics, applied probability, mathematical finance, statistics, dynamical systems, optimization and management science.