Anytime Exploitation of Stragglers in Synchronous Stochastic Gradient Descent
Nuwan S. Ferdinand, Benjamin Gharachorloo, S. Draper
2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA), pp. 141-146, December 2017
DOI: 10.1109/ICMLA.2017.0-166
Citations: 21
Abstract
In this paper we propose an approach to parallelizing synchronous stochastic gradient descent (SGD) that we term “Anytime-Gradients”. Anytime-Gradients is designed to exploit the work completed by slow compute nodes, or “stragglers”. In many existing approaches, the partial work completed by these nodes is discarded entirely. To maintain synchronization in our approach, each computational epoch is of fixed duration, and at the end of each epoch, workers send updated parameter vectors to a master node for combination. The master weights each update by the amount of work done. The Anytime-Gradients scheme is robust to both persistent and non-persistent stragglers and requires no prior knowledge about processor abilities. We show that the scheme effectively exploits stragglers and outperforms existing methods.
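To make the master's combination step concrete, below is a minimal sketch of one plausible reading of the rule "weights each update by the amount of work done": each worker's parameter vector is weighted in proportion to the number of mini-batch gradient steps it completed during the fixed-length epoch. The function name `combine_updates`, the use of step counts as the work measure, and the exact normalization are assumptions for illustration, not the paper's verbatim algorithm.

```python
import numpy as np

def combine_updates(param_vectors, work_counts):
    """Master-side combination step (hypothetical sketch).

    Each worker's updated parameter vector is weighted by the amount
    of work it completed during the fixed-duration epoch (here, the
    number of mini-batch gradient steps), so stragglers' partial work
    still contributes rather than being discarded.
    """
    weights = np.asarray(work_counts, dtype=float)
    weights /= weights.sum()            # normalize weights to sum to 1
    stacked = np.stack(param_vectors)   # shape: (num_workers, dim)
    return weights @ stacked            # work-weighted average of updates

# Example: three workers; the straggler (5 steps) still contributes,
# just with proportionally less weight than the fast workers.
thetas = [np.array([1.0, 2.0]), np.array([1.2, 1.8]), np.array([0.9, 2.1])]
steps = [20, 18, 5]
print(combine_updates(thetas, steps))
```

Because the epoch length is fixed in wall-clock time rather than in iterations, this combination needs no prior knowledge of processor speeds: a persistently slow node simply reports a smaller step count every epoch, and a transiently slow node is down-weighted only in the epochs where it lags.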