Toward Communication Efficient Adaptive Gradient Method

Xiangyi Chen, Xiaoyun Li, P. Li
{"title":"通信高效自适应梯度方法研究","authors":"Xiangyi Chen, Xiaoyun Li, P. Li","doi":"10.1145/3412815.3416891","DOIUrl":null,"url":null,"abstract":"In recent years, distributed optimization is proven to be an effective approach to accelerate training of large scale machine learning models such as deep neural networks. With the increasing computation power of GPUs, the bottleneck of training speed in distributed training is gradually shifting from computation to communication. Meanwhile, in the hope of training machine learning models on mobile devices, a new distributed training paradigm called \"federated learning'' has become popular. The communication time in federated learning is especially important due to the low bandwidth of mobile devices. While various approaches to improve the communication efficiency have been proposed for federated learning, most of them are designed with SGD as the prototype training algorithm. While adaptive gradient methods have been proven effective for training neural nets, the study of adaptive gradient methods in federated learning is scarce. In this paper, we propose an adaptive gradient method that can guarantee both the convergence and the communication efficiency for federated learning.","PeriodicalId":176130,"journal":{"name":"Proceedings of the 2020 ACM-IMS on Foundations of Data Science Conference","volume":"36 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-10-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"32","resultStr":"{\"title\":\"Toward Communication Efficient Adaptive Gradient Method\",\"authors\":\"Xiangyi Chen, Xiaoyun Li, P. Li\",\"doi\":\"10.1145/3412815.3416891\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In recent years, distributed optimization is proven to be an effective approach to accelerate training of large scale machine learning models such as deep neural networks. With the increasing computation power of GPUs, the bottleneck of training speed in distributed training is gradually shifting from computation to communication. Meanwhile, in the hope of training machine learning models on mobile devices, a new distributed training paradigm called \\\"federated learning'' has become popular. The communication time in federated learning is especially important due to the low bandwidth of mobile devices. While various approaches to improve the communication efficiency have been proposed for federated learning, most of them are designed with SGD as the prototype training algorithm. While adaptive gradient methods have been proven effective for training neural nets, the study of adaptive gradient methods in federated learning is scarce. 
In this paper, we propose an adaptive gradient method that can guarantee both the convergence and the communication efficiency for federated learning.\",\"PeriodicalId\":176130,\"journal\":{\"name\":\"Proceedings of the 2020 ACM-IMS on Foundations of Data Science Conference\",\"volume\":\"36 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2020-10-18\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"32\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 2020 ACM-IMS on Foundations of Data Science Conference\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3412815.3416891\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2020 ACM-IMS on Foundations of Data Science Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3412815.3416891","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 32

Abstract

In recent years, distributed optimization has proven to be an effective approach for accelerating the training of large-scale machine learning models such as deep neural networks. With the increasing computation power of GPUs, the training-speed bottleneck in distributed training is gradually shifting from computation to communication. Meanwhile, to enable training machine learning models on mobile devices, a new distributed training paradigm called "federated learning" has become popular. Communication time is especially important in federated learning because of the low bandwidth of mobile devices. While various approaches for improving communication efficiency have been proposed for federated learning, most are designed with SGD as the prototype training algorithm. Although adaptive gradient methods have proven effective for training neural networks, their study in federated learning remains scarce. In this paper, we propose an adaptive gradient method that guarantees both convergence and communication efficiency for federated learning.
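To make the communication-efficiency idea concrete, the following is a minimal sketch of a local-update adaptive scheme in the spirit of the abstract: each worker runs several AMSGrad-style adaptive steps locally on a toy least-squares problem and models are averaged only once per communication round. The problem setup, hyperparameters, and the decision to reset optimizer state each round are all illustrative assumptions, not the authors' exact algorithm.

```python
# Illustrative sketch only: local AMSGrad-style updates with periodic averaging.
# The toy objective, data, and hyperparameters below are hypothetical.
import numpy as np

rng = np.random.default_rng(0)

# Toy least-squares problem split across workers: f_w(x) = mean_i (a_i^T x - b_i)^2
d, n_workers, n_local = 10, 4, 20          # dimension, workers, samples per worker
A = [rng.normal(size=(n_local, d)) for _ in range(n_workers)]
b = [Ai @ rng.normal(size=d) for Ai in A]  # targets from a hidden ground truth

def grad(w, x):
    """Stochastic gradient of worker w's local objective at x (one random sample)."""
    i = rng.integers(n_local)
    return 2.0 * (A[w][i] @ x - b[w][i]) * A[w][i]

eta, beta1, beta2, eps = 0.05, 0.9, 0.99, 1e-8
H = 5                                       # local steps between communication rounds
x = np.zeros(d)                             # server model

for rnd in range(200):                      # communication rounds
    local_models = []
    for w in range(n_workers):
        xw, m, v, v_hat = x.copy(), np.zeros(d), np.zeros(d), np.zeros(d)
        for _ in range(H):                  # H adaptive steps with no communication
            g = grad(w, xw)
            m = beta1 * m + (1 - beta1) * g
            v = beta2 * v + (1 - beta2) * g * g
            v_hat = np.maximum(v_hat, v)    # AMSGrad-style max tracking
            xw -= eta * m / (np.sqrt(v_hat) + eps)
        local_models.append(xw)
    x = np.mean(local_models, axis=0)       # one communication: average local models
```

Communicating only every H local steps cuts the number of communication rounds by roughly a factor of H; keeping or synchronizing the adaptive statistics (m, v) across rounds, which this sketch simply resets, is one of the issues a full method must address.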