DC-SGD: Differentially Private SGD With Dynamic Clipping Through Gradient Norm Distribution Estimation

IF 6.3; Zone 1, Computer Science; JCR Q1, COMPUTER SCIENCE, THEORY & METHODS

IEEE Transactions on Information Forensics and Security. Pub Date: 2025-04-18. DOI: 10.1109/TIFS.2025.3557755

Chengkun Wei;Weixian Li;Chen Gong;Wenzhi Chen

IEEE Transactions on Information Forensics and Security, vol. 20, pp. 4498-4511. https://ieeexplore.ieee.org/document/10969624/

Citations: 0

Abstract

Differentially Private Stochastic Gradient Descent (DP-SGD) is a widely adopted technique for privacy-preserving deep learning. A critical challenge in DP-SGD is selecting the optimal clipping threshold C, which requires balancing clipping bias against noise magnitude and incurs substantial privacy and computing overhead during hyperparameter tuning. In this paper, we propose Dynamic Clipping DP-SGD (DC-SGD), a framework that uses differentially private histograms to estimate the gradient norm distribution and dynamically adjust the clipping threshold C. The framework includes two novel mechanisms: DC-SGD-P and DC-SGD-E. DC-SGD-P sets the clipping threshold from a percentile of the gradient norms, while DC-SGD-E chooses C to minimize the expected squared error of the gradients. These dynamic adjustments significantly reduce the burden of tuning the hyperparameter C. Extensive experiments on various deep learning tasks, including image classification and natural language processing, show that the proposed dynamic algorithms speed up hyperparameter tuning by up to 9 times compared with DP-SGD. In addition, DC-SGD-E improves accuracy on CIFAR10 by 10.62% over DP-SGD under the same privacy budget for hyperparameter tuning. We conduct rigorous theoretical privacy and convergence analyses and show that our methods integrate seamlessly with the Adam optimizer. Our results highlight the robust performance and efficiency of DC-SGD, offering a practical solution for differentially private deep learning with reduced computational overhead and enhanced privacy guarantees.
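The percentile idea behind DC-SGD-P can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the fixed bin layout, the use of Gaussian noise on the histogram counts, and the noise scales are all assumptions made for illustration, and the privacy accounting for the histogram release is omitted. The sketch builds a noisy histogram of per-sample gradient norms, reads a clipping threshold C off its empirical CDF, and then applies the standard DP-SGD clip-and-noise step with that C. The error-minimizing rule of DC-SGD-E is not covered here.

```python
import numpy as np


def dp_histogram(norms, bin_edges, noise_std, rng):
    """Noisy histogram of per-sample gradient norms (illustrative Gaussian noise).

    Each sample lands in exactly one bin, so adding or removing one sample
    changes a single count by 1.
    """
    # Clamp norms into the histogram range so every sample is counted.
    clamped = np.clip(norms, bin_edges[0], bin_edges[-1])
    counts, _ = np.histogram(clamped, bins=bin_edges)
    noisy = counts + rng.normal(0.0, noise_std, size=counts.shape)
    # Post-processing: negative counts carry no meaning, so floor them at zero.
    return np.maximum(noisy, 0.0)


def percentile_threshold(noisy_counts, bin_edges, p):
    """Upper edge of the first bin at which the noisy CDF reaches fraction p."""
    total = noisy_counts.sum()
    if total <= 0:
        # Degenerate histogram: fall back to the largest representable norm.
        return float(bin_edges[-1])
    cdf = np.cumsum(noisy_counts) / total
    idx = min(int(np.searchsorted(cdf, p)), len(noisy_counts) - 1)
    return float(bin_edges[idx + 1])


def clip_and_noise(per_sample_grads, C, sigma, rng):
    """Standard DP-SGD step: clip each gradient to norm C, sum, add noise, average."""
    norms = np.linalg.norm(per_sample_grads, axis=1)
    # Scale factor is 1 for gradients already within the threshold.
    scale = np.minimum(1.0, C / np.maximum(norms, 1e-12))
    clipped = per_sample_grads * scale[:, None]
    noise = rng.normal(0.0, sigma * C, size=per_sample_grads.shape[1])
    return (clipped.sum(axis=0) + noise) / len(per_sample_grads)
```

In a training loop under these assumptions, one would compute the per-sample gradient norms for the current batch, call `dp_histogram` and `percentile_threshold` to obtain C, and pass that C to `clip_and_noise`. The noise spent on the histogram consumes part of the privacy budget and would need to be accounted for alongside the gradient noise.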

Source journal

IEEE Transactions on Information Forensics and Security (Engineering & Technology: Engineering, Electrical & Electronic)

CiteScore: 14.40

Self-citation rate: 7.40%

Articles published: 234

Review time: 6.5 months

Journal description: The IEEE Transactions on Information Forensics and Security covers the sciences, technologies, and applications relating to information forensics, information security, biometrics, surveillance, and systems applications that incorporate these features.