Applicability of Deep Learning Model Trainings on Embedded GPU Devices: An Empirical Study

Po-Hsuan Chou, Chao Wang, Chih-Shuo Mei
{"title":"Applicability of Deep Learning Model Trainings on Embedded GPU Devices: An Empirical Study","authors":"Po-Hsuan Chou, Chao Wang, Chih-Shuo Mei","doi":"10.1109/MECO58584.2023.10155048","DOIUrl":null,"url":null,"abstract":"The wide applications of deep learning techniques have motivated the inclusion of both embedded GPU devices and workstation GPU cards into contemporary Industrial Internet-of-Things (IIoT) systems. Due to substantial differences between the two types of GPUs, deep-learning model training in its current practice is run on GPU cards, and embedded GPU devices are used for inferences or partial model training at best. To supply with empirical evidence and aid the decision of deep learning workload placement, this paper reports a set of experiments on the timeliness and energy efficiency of each GPU type, running both Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) model training. The results suggest that embedded GPUs did save the total energy cost despite the longer response time, but the amount of energy saving might not be significant in a practical sense. Further in this paper we report a case study for prognostics applications using LSTM. The results suggest that, by comparison, an embedded GPU may save about 90 percent of energy consumption at the cost of doubling the application response time. But neither the save in energy cost nor the increase in response time is significant enough to impact the application. 
These findings suggest that it may be feasible to place model training workload on either workstation GPU or embedded GPU.","PeriodicalId":187825,"journal":{"name":"2023 12th Mediterranean Conference on Embedded Computing (MECO)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 12th Mediterranean Conference on Embedded Computing (MECO)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MECO58584.2023.10155048","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

The wide application of deep learning techniques has motivated the inclusion of both embedded GPU devices and workstation GPU cards in contemporary Industrial Internet-of-Things (IIoT) systems. Because of substantial differences between the two types of GPUs, deep-learning model training is currently run on workstation GPU cards, while embedded GPU devices are used for inference or, at best, partial model training. To supply empirical evidence and aid deep-learning workload placement decisions, this paper reports a set of experiments on the timeliness and energy efficiency of each GPU type, running both Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) model training. The results suggest that embedded GPUs did reduce the total energy cost despite their longer response times, but the amount of energy saved might not be significant in a practical sense. Further, this paper reports a case study of a prognostics application using LSTM. The results suggest that, by comparison, an embedded GPU may save about 90 percent of the energy consumption at the cost of doubling the application response time; however, neither the saving in energy cost nor the increase in response time is significant enough to impact the application. These findings suggest that it may be feasible to place model-training workloads on either workstation GPUs or embedded GPUs.
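The trade-off summarized above — roughly 90 percent energy saving for a doubled response time — follows from the basic relation energy = average power draw × runtime. A minimal sketch of that arithmetic is shown below; the power and runtime figures are assumptions chosen for illustration, not measurements from the paper:

```python
# Illustration of the energy/response-time trade-off between a workstation
# GPU and an embedded GPU, assuming a constant average power draw per device.
# All numbers here are hypothetical, not results from the paper.

def energy_joules(avg_power_watts: float, runtime_seconds: float) -> float:
    """Energy consumed by a training run under a constant-power assumption."""
    return avg_power_watts * runtime_seconds

# Assumed figures: a workstation GPU drawing ~250 W finishes in one hour,
# while an embedded GPU drawing ~15 W takes twice as long.
workstation_energy = energy_joules(250.0, 3600.0)      # 900,000 J
embedded_energy = energy_joules(15.0, 2 * 3600.0)      # 108,000 J

saving = 1.0 - embedded_energy / workstation_energy
print(f"Embedded GPU energy saving: {saving:.0%}")
```

With these assumed numbers the embedded device saves 88 percent of the energy, close to the magnitude the abstract reports, while the response time doubles; in practice the power figures would come from device telemetry (e.g. `nvidia-smi` on a workstation card or `tegrastats` on a Jetson-class device).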