Benchmarking Resource Usage for Efficient Distributed Deep Learning

Nathan C Frey, Baolin Li, Joseph McDonald, Dan Zhao, Michael Jones, David Bestor, Devesh Tiwari, V. Gadepally, S. Samsi
{"title":"高效分布式深度学习的资源使用基准测试","authors":"Nathan C Frey, Baolin Li, Joseph McDonald, Dan Zhao, Michael Jones, David Bestor, Devesh Tiwari, V. Gadepally, S. Samsi","doi":"10.1109/HPEC55821.2022.9926375","DOIUrl":null,"url":null,"abstract":"Deep learning (DL) workflows demand an ever-increasing budget of compute and energy in order to achieve outsized gains. As such, it becomes essential to understand how different deep neural networks (DNNs) and training leverage increasing compute and energy resources-especially specialized computationally-intensive models across different domains and applications. In this paper, we conduct over 3,400 experiments training an array of deep networks representing various domains/tasks-natural language processing, computer vision, and chemistry-on up to 424 graphics processing units (GPUs). During training, our experiments systematically vary compute resource characteristics and energy -saving mechanisms such as power utilization and GPU clock rate limits to capture and illustrate the different trade-offs and scaling behaviors each representative model exhibits under various resource and energy-constrained regimes. We fit power law models that describe how training time scales with available compute resources and energy constraints. We anticipate that these findings will help inform and guide high-performance computing providers in optimizing resource utilization, by selectively reducing energy consumption for different deep learning tasks/workflows with minimal impact on training.","PeriodicalId":200071,"journal":{"name":"2022 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"Benchmarking Resource Usage for Efficient Distributed Deep Learning\",\"authors\":\"Nathan C Frey, Baolin Li, Joseph McDonald, Dan Zhao, Michael Jones, David Bestor, Devesh Tiwari, V. Gadepally, S. Samsi\",\"doi\":\"10.1109/HPEC55821.2022.9926375\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep learning (DL) workflows demand an ever-increasing budget of compute and energy in order to achieve outsized gains. As such, it becomes essential to understand how different deep neural networks (DNNs) and training leverage increasing compute and energy resources-especially specialized computationally-intensive models across different domains and applications. In this paper, we conduct over 3,400 experiments training an array of deep networks representing various domains/tasks-natural language processing, computer vision, and chemistry-on up to 424 graphics processing units (GPUs). During training, our experiments systematically vary compute resource characteristics and energy -saving mechanisms such as power utilization and GPU clock rate limits to capture and illustrate the different trade-offs and scaling behaviors each representative model exhibits under various resource and energy-constrained regimes. We fit power law models that describe how training time scales with available compute resources and energy constraints. 
We anticipate that these findings will help inform and guide high-performance computing providers in optimizing resource utilization, by selectively reducing energy consumption for different deep learning tasks/workflows with minimal impact on training.\",\"PeriodicalId\":200071,\"journal\":{\"name\":\"2022 IEEE High Performance Extreme Computing Conference (HPEC)\",\"volume\":\"26 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-01-28\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2022 IEEE High Performance Extreme Computing Conference (HPEC)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/HPEC55821.2022.9926375\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE High Performance Extreme Computing Conference (HPEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPEC55821.2022.9926375","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 6

Abstract

Deep learning (DL) workflows demand an ever-increasing budget of compute and energy in order to achieve outsized gains. As such, it becomes essential to understand how different deep neural networks (DNNs) and training leverage increasing compute and energy resources, especially specialized, computationally intensive models across different domains and applications. In this paper, we conduct over 3,400 experiments training an array of deep networks representing various domains/tasks (natural language processing, computer vision, and chemistry) on up to 424 graphics processing units (GPUs). During training, our experiments systematically vary compute resource characteristics and energy-saving mechanisms such as power utilization and GPU clock rate limits to capture and illustrate the different trade-offs and scaling behaviors each representative model exhibits under various resource- and energy-constrained regimes. We fit power-law models that describe how training time scales with available compute resources and energy constraints. We anticipate that these findings will help inform and guide high-performance computing providers in optimizing resource utilization, by selectively reducing energy consumption for different deep learning tasks/workflows with minimal impact on training.
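The abstract's power-law fits relate training time to the available compute. The sketch below illustrates what fitting such a model could look like; it is not the authors' code, and the functional form t(n) = a * n^b as well as the sample measurements are assumptions made purely for illustration.

```python
# Illustrative sketch: fit a power law t(n) = a * n**b relating training
# time t to the number of GPUs n. The data below are hypothetical, not
# measurements from the paper.
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b):
    """Training time modeled as a power law in the number of GPUs."""
    return a * np.power(n, b)

# Hypothetical (GPU count, training time in hours) measurements.
gpus = np.array([2, 4, 8, 16, 32, 64, 128], dtype=float)
hours = np.array([40.1, 21.0, 11.3, 6.4, 3.9, 2.6, 1.9])

# Fit the scaling exponent; p0 is an initial guess for (a, b).
(a, b), _ = curve_fit(power_law, gpus, hours, p0=(80.0, -1.0))
print(f"fitted scaling: t(n) ~ {a:.1f} * n^{b:.2f}")
```

An exponent b close to -1 would indicate near-linear speedup with added GPUs, while values closer to 0 would indicate diminishing returns, which is the kind of trade-off the paper quantifies under power and clock-rate constraints.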