{"title":"Distributed Differentially Private Stochastic Gradient Descent: An Empirical Study","authors":"István Hegedüs, Márk Jelasity","doi":"10.1109/PDP.2016.19","DOIUrl":null,"url":null,"abstract":"In fault-prone large-scale distributed environments stochastic gradient descent (SGD) is a popular approach to implement machine learning algorithms. Data privacy is a key concern in such environments, which is often addressed within the framework of differential privacy. The output quality of differentially private SGD implementations as a function of design choices has not yet been thoroughly evaluated. In this study, we examine this problem experimentally. We assume that every data record is stored by an independent node, which is a typical setup in networks of mobile devices or Internet of things (IoT) applications. In this model we identify a set of possible distributed differentially private SGD implementations. In these implementations all the sensitive computations are strictly local, and any public information is protected by differentially private mechanisms. This means that personal information can leak only if the corresponding node is directly compromised. We then perform a set of experiments to evaluate these implementations over several machine learning problems with both logistic regression and support vector machine (SVM) loss functions. Depending on the parameter setting and the choice of the algorithm, the performance of the noise-free algorithm can be closely approximated by differentially private variants.","PeriodicalId":192273,"journal":{"name":"2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)","volume":"20 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PDP.2016.19","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 12
Abstract
In fault-prone, large-scale distributed environments, stochastic gradient descent (SGD) is a popular approach to implementing machine learning algorithms. Data privacy is a key concern in such environments and is often addressed within the framework of differential privacy. However, the output quality of differentially private SGD implementations as a function of design choices has not yet been thoroughly evaluated. In this study, we examine this problem experimentally. We assume that every data record is stored by an independent node, which is a typical setup in networks of mobile devices or Internet of Things (IoT) applications. In this model, we identify a set of possible distributed differentially private SGD implementations. In these implementations, all sensitive computations are strictly local, and any public information is protected by differentially private mechanisms. This means that personal information can leak only if the corresponding node is directly compromised. We then perform a set of experiments to evaluate these implementations on several machine learning problems with both logistic regression and support vector machine (SVM) loss functions. Depending on the parameter setting and the choice of algorithm, the performance of the noise-free algorithm can be closely approximated by its differentially private variants.
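
To make the setup described in the abstract concrete, the sketch below shows one way a single node could release a differentially private gradient of a logistic-regression loss computed on its own record, with an aggregator averaging the noisy gradients into an SGD step. This is a minimal hypothetical illustration, not the authors' implementation: the L1 gradient clipping, the Laplace mechanism, the parameter names, and the specific loss are assumptions made here for the example.

```python
import numpy as np

def clip_l1(v, bound):
    """Clip a vector to L1 norm `bound` so the released gradient has known sensitivity."""
    norm = np.abs(v).sum()
    return v if norm <= bound else v * (bound / norm)

def local_private_gradient(w, x, y, clip_bound=1.0, epsilon=0.5):
    """Hypothetical sketch of one node's contribution (not the paper's exact mechanism).

    The node computes the logistic-regression gradient on its single record (x, y)
    with label y in {-1, +1}, clips it in L1 norm, and adds Laplace noise before
    releasing it. The raw record never leaves the node; only the noisy gradient
    becomes public.
    """
    margin = y * np.dot(w, x)
    grad = -y * x / (1.0 + np.exp(margin))   # gradient of log(1 + exp(-y * w.x))
    grad = clip_l1(grad, clip_bound)
    # Changing the node's record changes the clipped gradient by at most
    # 2 * clip_bound in L1 norm, so Laplace noise with scale
    # 2 * clip_bound / epsilon yields epsilon-differential privacy.
    noise = np.random.laplace(scale=2.0 * clip_bound / epsilon, size=grad.shape)
    return grad + noise

def sgd_step(w, noisy_grads, learning_rate=0.1):
    """Hypothetical aggregator: average the nodes' noisy gradients and take one SGD step."""
    return w - learning_rate * np.mean(noisy_grads, axis=0)
```

In such a scheme the privacy guarantee holds per released gradient; how the per-step budgets compose over many SGD iterations, and how the noise level trades off against model quality, is exactly the kind of design-choice question the paper evaluates empirically.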