{"title":"OSP: Overlapping Computation and Communication in Parameter Server for Fast Machine Learning","authors":"Haozhao Wang, Song Guo, Ruixuan Li","doi":"10.1145/3337821.3337828","DOIUrl":null,"url":null,"abstract":"When running in Parameter Server (PS), the Distributed Stochastic Gradient Descent (SGD) incurs significant communication delays because after pushing their updates, computing nodes (workers) have to wait for the global model to be communicated back from the master in every iteration. In this paper, we devise a new synchronization parallel mechanism named overlap synchronization parallel (OSP), in which the waiting time is removed by conducting computation and communication in an overlapped manner. We theoretically prove that our mechanism could achieve the same convergence rate compared to the sequential SGD for non-convex problems. Evaluations show that our mechanism significantly improves performance over the state-of-the-art ones, e.g., by 4× for both AlexNet and ResNet18 in terms of convergence speed.","PeriodicalId":405273,"journal":{"name":"Proceedings of the 48th International Conference on Parallel Processing","volume":"132 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-08-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 48th International Conference on Parallel Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3337821.3337828","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 15
Abstract
When running in Parameter Server (PS), the Distributed Stochastic Gradient Descent (SGD) incurs significant communication delays because after pushing their updates, computing nodes (workers) have to wait for the global model to be communicated back from the master in every iteration. In this paper, we devise a new synchronization parallel mechanism named overlap synchronization parallel (OSP), in which the waiting time is removed by conducting computation and communication in an overlapped manner. We theoretically prove that our mechanism could achieve the same convergence rate compared to the sequential SGD for non-convex problems. Evaluations show that our mechanism significantly improves performance over the state-of-the-art ones, e.g., by 4× for both AlexNet and ResNet18 in terms of convergence speed.