STRADS: a distributed framework for scheduled model parallel machine learning

Jin Kyu Kim, Qirong Ho, Seunghak Lee, Xun Zheng, Wei Dai, Garth A. Gibson, Eric P. Xing
{"title":"STRADS: a distributed framework for scheduled model parallel machine learning","authors":"Jin Kyu Kim, Qirong Ho, Seunghak Lee, Xun Zheng, Wei Dai, Garth A. Gibson, E. Xing","doi":"10.1145/2901318.2901331","DOIUrl":null,"url":null,"abstract":"Machine learning (ML) algorithms are commonly applied to big data, using distributed systems that partition the data across machines and allow each machine to read and update all ML model parameters --- a strategy known as data parallelism. An alternative and complimentary strategy, model parallelism, partitions the model parameters for non-shared parallel access and updates, and may periodically repartition the parameters to facilitate communication. Model parallelism is motivated by two challenges that data-parallelism does not usually address: (1) parameters may be dependent, thus naive concurrent updates can introduce errors that slow convergence or even cause algorithm failure; (2) model parameters converge at different rates, thus a small subset of parameters can bottleneck ML algorithm completion. We propose scheduled model parallelism (SchMP), a programming approach that improves ML algorithm convergence speed by efficiently scheduling parameter updates, taking into account parameter dependencies and uneven convergence. To support SchMP at scale, we develop a distributed framework STRADS which optimizes the throughput of SchMP programs, and benchmark four common ML applications written as SchMP programs: LDA topic modeling, matrix factorization, sparse least-squares (Lasso) regression and sparse logistic regression. By improving ML progress per iteration through SchMP programming whilst improving iteration throughput through STRADS we show that SchMP programs running on STRADS outperform non-model-parallel ML implementations: for example, SchMP LDA and SchMP Lasso respectively achieve 10x and 5x faster convergence than recent, well-established baselines.","PeriodicalId":20737,"journal":{"name":"Proceedings of the Eleventh European Conference on Computer Systems","volume":"12 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2016-04-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"80","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Eleventh European Conference on Computer Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2901318.2901331","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 80

Abstract

Machine learning (ML) algorithms are commonly applied to big data, using distributed systems that partition the data across machines and allow each machine to read and update all ML model parameters, a strategy known as data parallelism. An alternative and complementary strategy, model parallelism, partitions the model parameters for non-shared parallel access and updates, and may periodically repartition the parameters to facilitate communication. Model parallelism is motivated by two challenges that data parallelism does not usually address: (1) parameters may be dependent, thus naive concurrent updates can introduce errors that slow convergence or even cause algorithm failure; (2) model parameters converge at different rates, thus a small subset of parameters can bottleneck ML algorithm completion. We propose scheduled model parallelism (SchMP), a programming approach that improves ML algorithm convergence speed by efficiently scheduling parameter updates, taking into account parameter dependencies and uneven convergence. To support SchMP at scale, we develop a distributed framework, STRADS, which optimizes the throughput of SchMP programs, and we benchmark four common ML applications written as SchMP programs: LDA topic modeling, matrix factorization, sparse least-squares (Lasso) regression, and sparse logistic regression. By improving ML progress per iteration through SchMP programming while improving iteration throughput through STRADS, we show that SchMP programs running on STRADS outperform non-model-parallel ML implementations: for example, SchMP LDA and SchMP Lasso achieve 10x and 5x faster convergence, respectively, than recent, well-established baselines.
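
To make the two scheduling ideas concrete, the sketch below illustrates dependency-aware, priority-based coordinate selection for Lasso (one of the benchmarked applications) in plain Python. This is a single-machine sketch under stated assumptions, not the STRADS API: names such as `schedule_round` and `scheduled_lasso` are hypothetical. Coordinates are sampled in proportion to their recent progress (addressing uneven convergence), and candidates that are strongly correlated with already-selected coordinates are deferred to a later round (addressing parameter dependencies).

```python
# Minimal, illustrative sketch of the scheduling ideas behind SchMP
# (prioritize by convergence progress, group nearly independent parameters),
# applied to Lasso coordinate descent. NOT the STRADS API; all names here
# (schedule_round, scheduled_lasso, ...) are hypothetical.

import numpy as np

def soft_threshold(z, lam):
    """Soft-thresholding operator used by Lasso coordinate descent."""
    return np.sign(z) * np.maximum(np.abs(z) - lam, 0.0)

def schedule_round(priority, X_corr, batch, corr_limit, rng):
    """Pick up to `batch` coordinates: sample by priority (uneven convergence),
    then drop candidates too correlated with already-chosen ones (dependencies)."""
    p = priority + 1e-6                      # keep every coordinate reachable
    p = p / p.sum()
    candidates = rng.choice(len(p), size=min(4 * batch, len(p)),
                            replace=False, p=p)
    chosen = []
    for j in candidates:
        if all(abs(X_corr[j, k]) < corr_limit for k in chosen):
            chosen.append(j)
        if len(chosen) == batch:
            break
    return chosen

def scheduled_lasso(X, y, lam=0.1, rounds=200, batch=8, corr_limit=0.5, seed=0):
    n, d = X.shape
    rng = np.random.default_rng(seed)
    beta = np.zeros(d)
    priority = np.ones(d)                    # start with uniform priorities
    col_norm_sq = (X ** 2).sum(axis=0)
    X_corr = np.corrcoef(X, rowvar=False)    # pairwise feature correlations
    resid = y - X @ beta
    for _ in range(rounds):
        for j in schedule_round(priority, X_corr, batch, corr_limit, rng):
            # The chosen coordinates are (nearly) independent, so a real
            # system could update them in parallel; we loop for brevity.
            rho = X[:, j] @ resid + col_norm_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam * n) / col_norm_sq[j]
            resid += X[:, j] * (beta[j] - new_bj)
            priority[j] = abs(new_bj - beta[j])   # reprioritize by progress
            beta[j] = new_bj
    return beta

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.standard_normal((200, 50))
    true_beta = np.zeros(50); true_beta[:5] = 1.0
    y = X @ true_beta + 0.01 * rng.standard_normal(200)
    print(np.round(scheduled_lasso(X, y), 2)[:10])
```

In the paper's distributed setting, this kind of selection would be made by a scheduler process and the chosen, nearly independent parameters would be updated in parallel across workers; the sketch updates them sequentially on one machine for brevity.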