Analysis of model parallelism for distributed neural networks

Adrián Castelló, M. F. Dolz, E. S. Quintana‐Ortí, J. Duato
{"title":"Analysis of model parallelism for distributed neural networks","authors":"Adrián Castelló, M. F. Dolz, E. S. Quintana‐Ortí, J. Duato","doi":"10.1145/3343211.3343218","DOIUrl":null,"url":null,"abstract":"We analyze the performance of model parallelism applied to the training of deep neural networks on clusters. For this study, we elaborate a parameterized analytical performance model that captures the main computational and communication stages in distributed model parallel training. This model is then leveraged to assess the impact on the performance of four representative convolutional neural networks (CNNs) when varying the node throughput in terms of operations per second and memory bandwidth, the number of nodes of the cluster, the bandwidth of the network links, and algorithmic parameters such as the dimension of the batch. As a second contribution of this paper, we discuss the need for specialized collective communication variants of the MPI_Allgather and MPI_Allreduce primitives where the number of \"contributing\" processes differs from the number of processes receiving a copy/part of the result during training. Furthermore, we analyze the effect that the actual implementation of the algorithms underlying the collective communication primitives exert on the performance of the distributed model parallel realization of the selected CNNs.","PeriodicalId":314904,"journal":{"name":"Proceedings of the 26th European MPI Users' Group Meeting","volume":"5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-09-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 26th European MPI Users' Group Meeting","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3343211.3343218","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 10

Abstract

We analyze the performance of model parallelism applied to the training of deep neural networks on clusters. For this study, we develop a parameterized analytical performance model that captures the main computational and communication stages in distributed model parallel training. This model is then leveraged to assess the impact on the performance of four representative convolutional neural networks (CNNs) when varying the node throughput in terms of operations per second and memory bandwidth, the number of nodes of the cluster, the bandwidth of the network links, and algorithmic parameters such as the dimension of the batch. As a second contribution of this paper, we discuss the need for specialized collective communication variants of the MPI_Allgather and MPI_Allreduce primitives where the number of "contributing" processes differs from the number of processes receiving a copy/part of the result during training. Furthermore, we analyze the effect that the actual implementation of the algorithms underlying the collective communication primitives exerts on the performance of the distributed model parallel realization of the selected CNNs.
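The abstract motivates collectives in which the set of "contributing" processes is smaller than the set of processes that must receive the result. As a rough illustration of that communication pattern only (not the authors' proposed primitive), the following C/MPI sketch emulates such a partial-contributor allgather with standard MPI-3 calls; the constants NCONTRIB and CHUNK are illustrative assumptions, not values from the paper.

/* Sketch: only the first NCONTRIB ranks contribute data, but every rank
 * must end up with the gathered result. Emulated here with
 * MPI_Comm_split + MPI_Allgather + MPI_Bcast. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define NCONTRIB 2   /* number of "contributing" processes (assumption) */
#define CHUNK    4   /* elements contributed per process (assumption)   */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int contributes = (rank < NCONTRIB);
    double sendbuf[CHUNK];
    double *recvbuf = malloc(NCONTRIB * CHUNK * sizeof(double));

    for (int i = 0; i < CHUNK; ++i)
        sendbuf[i] = rank * 100.0 + i;   /* dummy local data */

    /* Split the contributors into their own communicator. */
    MPI_Comm contrib_comm;
    MPI_Comm_split(MPI_COMM_WORLD, contributes ? 0 : MPI_UNDEFINED,
                   rank, &contrib_comm);

    /* Step 1: contributors exchange their pieces among themselves. */
    if (contributes) {
        MPI_Allgather(sendbuf, CHUNK, MPI_DOUBLE,
                      recvbuf, CHUNK, MPI_DOUBLE, contrib_comm);
        MPI_Comm_free(&contrib_comm);
    }

    /* Step 2: one contributor re-broadcasts the gathered result so that
     * non-contributing ranks also receive a copy. */
    MPI_Bcast(recvbuf, NCONTRIB * CHUNK, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    if (rank == size - 1)
        printf("rank %d received %d gathered elements\n",
               rank, NCONTRIB * CHUNK);

    free(recvbuf);
    MPI_Finalize();
    return 0;
}

A specialized collective of the kind the paper calls for could fuse these two steps and avoid re-sending data to ranks that already hold it; the cost of such redundant traffic is the kind of overhead the authors' analytical model is designed to expose.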