An Experimental Study on Deep Learning Based on Different Hardware Configurations

Jingjun Li, Chen Zhang, Q. Cao, Chuanyi Qi, Jianzhong Huang, C. Xie
{"title":"基于不同硬件配置的深度学习实验研究","authors":"Jingjun Li, Chen Zhang, Q. Cao, Chuanyi Qi, Jianzhong Huang, C. Xie","doi":"10.1109/NAS.2017.8026843","DOIUrl":null,"url":null,"abstract":"Deep learning has exhibited high accuracy and applicability in machine learning field recently, by consuming tremendous computational resources processing massive data. To improve the performance of deep learning, GPUs have been introduced to accelerate the training phase. The complex data processing infrastructure demands high-efficient collaboration among underlying hardware components, such as CPU, GPU, memory, and storage devices. Unfortunately, few work has presented a systematic analysis about the impact of hardware configurations on the overall performance of deep learning. In this paper, we aim to make an experimental study on a standalone system to evaluate how various hardware configurations affect the overall performance of deep learning. We conducted a series of experiments using varied configurations on storage devices, main memory, CPU, and GPU to observe the overall performance quantitatively. Based on analyzing these results, we found that the performance greatly relies on the hardware configurations. Specifically, the computation is still the primary bottleneck as double GPUs and triple GPUs shorten the execution time by 44\\% and 59\\% respectively. Besides, both CPU frequency and storage subsystem can significantly affect running time while the memory size has no obvious effect on the running time for training neural network models. We believe our experimental results can help shed light on further optimizing the performance of deep learning in computer systems.","PeriodicalId":222161,"journal":{"name":"2017 International Conference on Networking, Architecture, and Storage (NAS)","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"An Experimental Study on Deep Learning Based on Different Hardware Configurations\",\"authors\":\"Jingjun Li, Chen Zhang, Q. Cao, Chuanyi Qi, Jianzhong Huang, C. Xie\",\"doi\":\"10.1109/NAS.2017.8026843\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Deep learning has exhibited high accuracy and applicability in machine learning field recently, by consuming tremendous computational resources processing massive data. To improve the performance of deep learning, GPUs have been introduced to accelerate the training phase. The complex data processing infrastructure demands high-efficient collaboration among underlying hardware components, such as CPU, GPU, memory, and storage devices. Unfortunately, few work has presented a systematic analysis about the impact of hardware configurations on the overall performance of deep learning. In this paper, we aim to make an experimental study on a standalone system to evaluate how various hardware configurations affect the overall performance of deep learning. We conducted a series of experiments using varied configurations on storage devices, main memory, CPU, and GPU to observe the overall performance quantitatively. Based on analyzing these results, we found that the performance greatly relies on the hardware configurations. Specifically, the computation is still the primary bottleneck as double GPUs and triple GPUs shorten the execution time by 44\\\\% and 59\\\\% respectively. 
Besides, both CPU frequency and storage subsystem can significantly affect running time while the memory size has no obvious effect on the running time for training neural network models. We believe our experimental results can help shed light on further optimizing the performance of deep learning in computer systems.\",\"PeriodicalId\":222161,\"journal\":{\"name\":\"2017 International Conference on Networking, Architecture, and Storage (NAS)\",\"volume\":\"7 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2017-08-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2017 International Conference on Networking, Architecture, and Storage (NAS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/NAS.2017.8026843\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 International Conference on Networking, Architecture, and Storage (NAS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/NAS.2017.8026843","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 6

Abstract

Deep learning has recently exhibited high accuracy and broad applicability in the machine learning field, at the cost of tremendous computational resources for processing massive data. To improve the performance of deep learning, GPUs have been introduced to accelerate the training phase. This complex data processing infrastructure demands highly efficient collaboration among the underlying hardware components, such as the CPU, GPU, memory, and storage devices. Unfortunately, little work has presented a systematic analysis of the impact of hardware configurations on the overall performance of deep learning. In this paper, we conduct an experimental study on a standalone system to evaluate how various hardware configurations affect the overall performance of deep learning. We ran a series of experiments with varied configurations of storage devices, main memory, CPU, and GPU to observe the overall performance quantitatively. Based on the analysis of these results, we found that performance relies greatly on the hardware configuration. Specifically, computation is still the primary bottleneck, as two GPUs and three GPUs shorten the execution time by 44% and 59%, respectively. In addition, both CPU frequency and the storage subsystem can significantly affect running time, while memory size has no obvious effect on the running time of training neural network models. We believe our experimental results can help shed light on further optimizing the performance of deep learning in computer systems.
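As a rough illustration of what the reported scaling numbers imply, the sketch below converts the abstract's execution-time reductions (44% with two GPUs, 59% with three GPUs, relative to a single-GPU baseline) into speedup and parallel-efficiency figures. The helper functions and the efficiency interpretation are our own assumptions for illustration; they are not code or analysis from the paper itself.

```python
# Illustrative only: derive speedup and parallel efficiency from the
# execution-time reductions reported in the abstract. The figures 0.44
# and 0.59 come from the paper; everything else is a hypothetical sketch.

def speedup_from_reduction(reduction: float) -> float:
    """Speedup implied by a fractional reduction in execution time."""
    return 1.0 / (1.0 - reduction)

def parallel_efficiency(speedup: float, n_gpus: int) -> float:
    """Fraction of ideal linear scaling actually achieved."""
    return speedup / n_gpus

for n_gpus, reduction in [(2, 0.44), (3, 0.59)]:
    s = speedup_from_reduction(reduction)
    e = parallel_efficiency(s, n_gpus)
    print(f"{n_gpus} GPUs: speedup ~{s:.2f}x, efficiency ~{e:.0%}")

# Expected output (approximately):
#   2 GPUs: speedup ~1.79x, efficiency ~89%
#   3 GPUs: speedup ~2.44x, efficiency ~81%
```

The sub-linear efficiencies implied by these numbers are consistent with the paper's conclusion: computation remains the primary bottleneck, but the CPU, storage subsystem, and inter-device coordination also shape the overall training time.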