Anytime Exploitation of Stragglers in Synchronous Stochastic Gradient Descent

Nuwan S. Ferdinand, Benjamin Gharachorloo, S. Draper
{"title":"同步随机梯度下降中离散机的随时开发","authors":"Nuwan S. Ferdinand, Benjamin Gharachorloo, S. Draper","doi":"10.1109/ICMLA.2017.0-166","DOIUrl":null,"url":null,"abstract":"In this paper we propose an approach to parallelizing synchronous stochastic gradient descent (SGD) that we term “Anytime-Gradients”. The Anytime-Gradients is designed to exploit the work completed by slow compute nodes or “stragglers”. In many approaches work completed by these nodes, while only partial, is discarded completely. To maintain synchronization in our approach, each computational epoch is of fixed duration, and at the end of each epoch, workers send updated parameter vectors to a master mode for combination. The master weights each update by the amount of work done. The Anytime-Gradients scheme is robust to both persistent and non-persistent stragglers and requires no prior knowledge about processor abilities. We show that the scheme effectively exploits stragglers and outperforms existing methods.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"42 1","pages":"141-146"},"PeriodicalIF":0.0000,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"21","resultStr":"{\"title\":\"Anytime Exploitation of Stragglers in Synchronous Stochastic Gradient Descent\",\"authors\":\"Nuwan S. Ferdinand, Benjamin Gharachorloo, S. Draper\",\"doi\":\"10.1109/ICMLA.2017.0-166\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper we propose an approach to parallelizing synchronous stochastic gradient descent (SGD) that we term “Anytime-Gradients”. The Anytime-Gradients is designed to exploit the work completed by slow compute nodes or “stragglers”. In many approaches work completed by these nodes, while only partial, is discarded completely. To maintain synchronization in our approach, each computational epoch is of fixed duration, and at the end of each epoch, workers send updated parameter vectors to a master mode for combination. The master weights each update by the amount of work done. The Anytime-Gradients scheme is robust to both persistent and non-persistent stragglers and requires no prior knowledge about processor abilities. 
We show that the scheme effectively exploits stragglers and outperforms existing methods.\",\"PeriodicalId\":6636,\"journal\":{\"name\":\"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)\",\"volume\":\"42 1\",\"pages\":\"141-146\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"21\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICMLA.2017.0-166\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLA.2017.0-166","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 21

Abstract

In this paper we propose an approach to parallelizing synchronous stochastic gradient descent (SGD) that we term “Anytime-Gradients”. The Anytime-Gradients scheme is designed to exploit the work completed by slow compute nodes, or “stragglers”. In many approaches, the work completed by these nodes, while only partial, is discarded completely. To maintain synchronization in our approach, each computational epoch is of fixed duration, and at the end of each epoch, workers send updated parameter vectors to a master node for combination. The master weights each update by the amount of work done. The Anytime-Gradients scheme is robust to both persistent and non-persistent stragglers and requires no prior knowledge of processor abilities. We show that the scheme effectively exploits stragglers and outperforms existing methods.
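The abstract describes the master's combining rule only at a high level. Below is a minimal Python sketch of that idea under stated assumptions: the least-squares objective, the helper names `worker_update` and `master_combine`, and the per-node speeds are hypothetical choices for illustration, not details taken from the paper.

```python
import numpy as np

np.random.seed(0)

def worker_update(x, A, b, lr, num_batches):
    """Hypothetical worker: starting from the current global parameters x,
    run as many local SGD steps as this node managed to finish within the
    fixed-length epoch, then report the result and the work count."""
    x = x.copy()
    for _ in range(num_batches):
        i = np.random.randint(len(b))          # pick one training sample
        grad = (A[i] @ x - b[i]) * A[i]        # least-squares gradient
        x -= lr * grad
    return x, num_batches

def master_combine(updates):
    """Master-side rule suggested by the abstract: weight each worker's
    parameter vector by the amount of work it completed, so the partial
    work of stragglers still contributes instead of being discarded."""
    total_work = sum(n for _, n in updates)
    return sum(n * x for x, n in updates) / total_work

# Toy run on a least-squares problem with three workers of different speeds;
# the slowest node (3 batches per epoch) plays the role of the straggler.
d, N = 5, 200
A = np.random.randn(N, d)
x_true = np.random.randn(d)
b = A @ x_true

x = np.zeros(d)
speeds = [20, 12, 3]                           # batches finished per fixed epoch
for epoch in range(30):
    updates = [worker_update(x, A, b, lr=0.01, num_batches=s) for s in speeds]
    x = master_combine(updates)

print("parameter error:", np.linalg.norm(x - x_true))
```

The point of the weighting is that a slow node's partially updated parameter vector enters the combination with a proportionally smaller weight rather than being thrown away, which matches the abstract's claim of exploiting rather than discarding straggler work.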