Towards optimal placement and scheduling of DNN operations with Pesto

Ubaid Ullah Hafeez, Xiao Sun, Anshul Gandhi, Zhenhua Liu
{"title":"Towards optimal placement and scheduling of DNN operations with Pesto","authors":"Ubaid Ullah Hafeez, Xiao Sun, Anshul Gandhi, Zhenhua Liu","doi":"10.1145/3464298.3476132","DOIUrl":null,"url":null,"abstract":"The increasing size of Deep Neural Networks (DNNs) has necessitated the use of multiple GPUs to host a single DNN model, a practice commonly referred to as model parallelism. The key challenge for model parallelism is to efficiently and effectively partition the DNN model across GPUs to avoid communication overheads while maximizing the GPU utilization, with the end-goal of minimizing the training time of DNN models. Existing approaches either take a long time(hours or even days) to find an effective partition or settle for sub-optimal partitioning, invariably increasing the end-to-end training effort. In this paper, we design and implement Pesto, a fast and near-optimal model placement technique for automatically partitioning arbitrary DNNs across multiple GPUs. The key idea in Pesto is to jointly optimize the model placement and scheduling at the fine-grained operation level to minimize inter-GPU communication while maximizing the opportunity to parallelize the model across GPUs. By carefully formulating the problem as an integer program, Pesto can provide the optimal placement and scheduling. We implement Pesto in TensorFlow and show that Pesto can reduce model training time by up to 31% compared to state-of-the-art approaches, across several large DNN models.","PeriodicalId":154994,"journal":{"name":"Proceedings of the 22nd International Middleware Conference","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 22nd International Middleware Conference","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3464298.3476132","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 15

Abstract

The increasing size of Deep Neural Networks (DNNs) has necessitated the use of multiple GPUs to host a single DNN model, a practice commonly referred to as model parallelism. The key challenge for model parallelism is to efficiently and effectively partition the DNN model across GPUs to avoid communication overheads while maximizing GPU utilization, with the end goal of minimizing the training time of DNN models. Existing approaches either take a long time (hours or even days) to find an effective partition or settle for sub-optimal partitioning, invariably increasing the end-to-end training effort. In this paper, we design and implement Pesto, a fast and near-optimal model placement technique for automatically partitioning arbitrary DNNs across multiple GPUs. The key idea in Pesto is to jointly optimize model placement and scheduling at the fine-grained operation level, minimizing inter-GPU communication while maximizing the opportunity to parallelize the model across GPUs. By carefully formulating the problem as an integer program, Pesto can provide optimal placement and scheduling. We implement Pesto in TensorFlow and show that Pesto can reduce model training time by up to 31% compared to state-of-the-art approaches, across several large DNN models.
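
The abstract does not spell out the integer program, so the following is a minimal illustrative sketch of a joint placement-and-scheduling ILP of the kind described, not Pesto's actual formulation. All symbols are assumptions for illustration: binary placement variables x_{o,g} (operation o placed on GPU g), cross-edge indicators y_{uv}, start times s_o, compute times p_o, communication costs c_{uv}, and makespan T, over the operation DAG (O, E) and GPU set G.

% Illustrative joint placement-and-scheduling ILP (assumed, not Pesto's exact model)
\begin{align}
\min\quad & T \\
\text{s.t.}\quad & \sum_{g \in G} x_{o,g} = 1 && \forall o \in O && \text{(each op placed on exactly one GPU)} \\
& y_{uv} \ge x_{u,g} - x_{v,g} && \forall (u,v) \in E,\ \forall g \in G && \text{($y_{uv}=1$ when the edge crosses GPUs)} \\
& s_v \ge s_u + p_u + c_{uv}\, y_{uv} && \forall (u,v) \in E && \text{(precedence; pay $c_{uv}$ only across GPUs)} \\
& T \ge s_o + p_o && \forall o \in O && \text{(makespan bounds every finish time)} \\
& x_{o,g},\ y_{uv} \in \{0,1\},\quad s_o \ge 0
\end{align}

A complete model of this kind would also need disjunctive (no-overlap) constraints so that two operations placed on the same GPU cannot execute concurrently; these are typically linearized with big-M ordering binaries and account for most of the solver effort, which is consistent with the abstract's emphasis on careful formulation.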