Applicability of Deep Learning Model Trainings on Embedded GPU Devices: An Empirical Study

Po-Hsuan Chou, Chao Wang, Chih-Shuo Mei
{"title":"Applicability of Deep Learning Model Trainings on Embedded GPU Devices: An Empirical Study","authors":"Po-Hsuan Chou, Chao Wang, Chih-Shuo Mei","doi":"10.1109/MECO58584.2023.10155048","DOIUrl":null,"url":null,"abstract":"The wide applications of deep learning techniques have motivated the inclusion of both embedded GPU devices and workstation GPU cards into contemporary Industrial Internet-of-Things (IIoT) systems. Due to substantial differences between the two types of GPUs, deep-learning model training in its current practice is run on GPU cards, and embedded GPU devices are used for inferences or partial model training at best. To supply with empirical evidence and aid the decision of deep learning workload placement, this paper reports a set of experiments on the timeliness and energy efficiency of each GPU type, running both Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) model training. The results suggest that embedded GPUs did save the total energy cost despite the longer response time, but the amount of energy saving might not be significant in a practical sense. Further in this paper we report a case study for prognostics applications using LSTM. The results suggest that, by comparison, an embedded GPU may save about 90 percent of energy consumption at the cost of doubling the application response time. But neither the save in energy cost nor the increase in response time is significant enough to impact the application. 
These findings suggest that it may be feasible to place model training workload on either workstation GPU or embedded GPU.","PeriodicalId":187825,"journal":{"name":"2023 12th Mediterranean Conference on Embedded Computing (MECO)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-06-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2023 12th Mediterranean Conference on Embedded Computing (MECO)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MECO58584.2023.10155048","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

Abstract

The wide application of deep learning techniques has motivated the inclusion of both embedded GPU devices and workstation GPU cards in contemporary Industrial Internet-of-Things (IIoT) systems. Because of substantial differences between the two types of GPUs, deep-learning model training is currently run on workstation GPU cards, while embedded GPU devices are used for inference or, at best, partial model training. To supply empirical evidence and aid deep-learning workload placement decisions, this paper reports a set of experiments on the timeliness and energy efficiency of each GPU type, running both Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) model training. The results suggest that embedded GPUs did reduce the total energy cost despite their longer response times, but the amount of energy saved might not be significant in a practical sense. Further, this paper reports a case study of a prognostics application using LSTM. The results suggest that, by comparison, an embedded GPU may save about 90 percent of the energy consumption at the cost of doubling the application response time; however, neither the saving in energy cost nor the increase in response time is significant enough to impact the application. These findings suggest that it may be feasible to place model-training workloads on either workstation GPUs or embedded GPUs.
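The trade-off summarized above — roughly 90 percent energy saving for a doubled response time — follows from the basic relation energy = average power draw × runtime. A minimal sketch of that arithmetic is shown below; the power and runtime figures are assumptions chosen for illustration, not measurements from the paper:

```python
# Illustration of the energy/response-time trade-off between a workstation
# GPU and an embedded GPU, assuming a constant average power draw per device.
# All numbers here are hypothetical, not results from the paper.

def energy_joules(avg_power_watts: float, runtime_seconds: float) -> float:
    """Energy consumed by a training run under a constant-power assumption."""
    return avg_power_watts * runtime_seconds

# Assumed figures: a workstation GPU drawing ~250 W finishes in one hour,
# while an embedded GPU drawing ~15 W takes twice as long.
workstation_energy = energy_joules(250.0, 3600.0)      # 900,000 J
embedded_energy = energy_joules(15.0, 2 * 3600.0)      # 108,000 J

saving = 1.0 - embedded_energy / workstation_energy
print(f"Embedded GPU energy saving: {saving:.0%}")
```

With these assumed numbers the embedded device saves 88 percent of the energy, close to the magnitude the abstract reports, while the response time doubles; in practice the power figures would come from device telemetry (e.g. `nvidia-smi` on a workstation card or `tegrastats` on a Jetson-class device).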