AC922架构对深度神经网络训练性能的影响

P. Rosciszewski, Michał Iwański, P. Czarnul
{"title":"AC922架构对深度神经网络训练性能的影响","authors":"P. Rosciszewski, Michał Iwański, P. Czarnul","doi":"10.1109/HPCS48598.2019.9188164","DOIUrl":null,"url":null,"abstract":"Practical deep learning applications require more and more computing power. New computing architectures emerge, specifically designed for the artificial intelligence applications, including the IBM Power System AC922. In this paper we confront an AC922 (8335-GTG) server equipped with 4 NVIDIA Volta V100 GPUs with selected deep neural network training applications, including four convolutional and one recurrent model. We report performance results depending on batch sizes and GPU selection and compare them with the results from another contemporary workstation based on the same set of GPUs – NVIDIA® DGX Station™. The results show that the AC922 performs better in all tested configurations, achieving improvements up to 10.3%. Profiling indicates that the improvement is due to the efficient I/O pipeline. The performance differences depend on the specific model, rather than on the model class (RNN/CNN). Both systems offer good scalability up to 4 GPUs. In certain cases there is a significant difference in performance depending on exactly which GPUs are used for computations.","PeriodicalId":371856,"journal":{"name":"2019 International Conference on High Performance Computing & Simulation (HPCS)","volume":"61 1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"The impact of the AC922 Architecture on Performance of Deep Neural Network Training\",\"authors\":\"P. Rosciszewski, Michał Iwański, P. Czarnul\",\"doi\":\"10.1109/HPCS48598.2019.9188164\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Practical deep learning applications require more and more computing power. New computing architectures emerge, specifically designed for the artificial intelligence applications, including the IBM Power System AC922. In this paper we confront an AC922 (8335-GTG) server equipped with 4 NVIDIA Volta V100 GPUs with selected deep neural network training applications, including four convolutional and one recurrent model. We report performance results depending on batch sizes and GPU selection and compare them with the results from another contemporary workstation based on the same set of GPUs – NVIDIA® DGX Station™. The results show that the AC922 performs better in all tested configurations, achieving improvements up to 10.3%. Profiling indicates that the improvement is due to the efficient I/O pipeline. The performance differences depend on the specific model, rather than on the model class (RNN/CNN). Both systems offer good scalability up to 4 GPUs. In certain cases there is a significant difference in performance depending on exactly which GPUs are used for computations.\",\"PeriodicalId\":371856,\"journal\":{\"name\":\"2019 International Conference on High Performance Computing & Simulation (HPCS)\",\"volume\":\"61 1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 International Conference on High Performance Computing & Simulation (HPCS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/HPCS48598.2019.9188164\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 International Conference on High Performance Computing & Simulation (HPCS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCS48598.2019.9188164","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

摘要

实际的深度学习应用需要越来越多的计算能力。新的计算架构出现了,专门为人工智能应用设计,包括IBM Power System AC922。在本文中,我们面对一个配备4个NVIDIA Volta V100 gpu的AC922 (8335-GTG)服务器,并选择深度神经网络训练应用,包括四个卷积模型和一个循环模型。我们根据批处理大小和GPU选择报告性能结果,并将其与基于同一组GPU的另一个现代工作站(NVIDIA®DGX Station™)的结果进行比较。结果表明,AC922在所有测试配置中都表现较好,实现了10.3%的改进。分析表明,改进是由于高效的I/O管道。性能差异取决于特定的模型,而不是模型类(RNN/CNN)。这两个系统都提供了良好的可扩展性,最多可达4个gpu。在某些情况下,根据使用哪种gpu进行计算,性能会有显著差异。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
The impact of the AC922 Architecture on Performance of Deep Neural Network Training
Practical deep learning applications require more and more computing power. New computing architectures emerge, specifically designed for the artificial intelligence applications, including the IBM Power System AC922. In this paper we confront an AC922 (8335-GTG) server equipped with 4 NVIDIA Volta V100 GPUs with selected deep neural network training applications, including four convolutional and one recurrent model. We report performance results depending on batch sizes and GPU selection and compare them with the results from another contemporary workstation based on the same set of GPUs – NVIDIA® DGX Station™. The results show that the AC922 performs better in all tested configurations, achieving improvements up to 10.3%. Profiling indicates that the improvement is due to the efficient I/O pipeline. The performance differences depend on the specific model, rather than on the model class (RNN/CNN). Both systems offer good scalability up to 4 GPUs. In certain cases there is a significant difference in performance depending on exactly which GPUs are used for computations.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信