Evaluating the Performance of Deep Learning Inference Service on Edge Platform

Hyun-Hwa Choi, Jae-Geun Cha, Seung Hyun Yun, Dae-Won Kim, Sumin Jang, S. Kim
2021 International Conference on Information and Communication Technology Convergence (ICTC), October 20, 2021. DOI: 10.1109/ICTC52510.2021.9620870

Abstract

Deep learning inference requires a tremendous amount of computation and is typically offloaded to the cloud for execution. Recently, edge computing, which processes and stores data at the edge of the Internet, closest to mobile devices and sensors, has been considered a new computing paradigm. We have studied the performance of deep neural network (DNN) inference services under different configurations of the resources assigned to a container. In this work, we measured and analyzed a real-world edge service on a containerization platform. The edge service, named A!Eye, is an application that performs various DNN inferences. The service comprises both CPU-friendly and GPU-friendly tasks, and the CPU tasks account for more than half of its latency. Our analyses reveal interesting findings about running a DNN inference service on a container-based execution platform: (a) the latency of DNN-inference-based edge services is affected by the performance of CPU-bound operations; (b) pinning CPUs can reduce the latency of an edge service; and (c) to improve the performance of an edge service, it is important to avoid bottlenecks on the PCIe bus shared by resources such as CPUs, GPUs, and NICs.
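Finding (b), pinning a container's workload to fixed CPU cores, can be sketched with a minimal example. The helper below (a hypothetical illustration, not code from the paper) uses Linux's `sched_setaffinity` to restrict a process to a chosen core set, which is the process-level equivalent of launching a container with Docker's `--cpuset-cpus` option; the core IDs are illustrative, and in practice one would pick cores on the same NUMA node as the GPU and NIC to limit contention on the shared PCIe path (finding (c)).

```python
import os

def pin_to_cores(cores):
    """Restrict the calling process to the given CPU core IDs (Linux only).

    Roughly equivalent to running a container with
    `docker run --cpuset-cpus=<ids>`.
    """
    os.sched_setaffinity(0, set(cores))  # pid 0 = the current process
    return os.sched_getaffinity(0)       # report the effective affinity

if __name__ == "__main__":
    # Example: pin an inference worker to cores 0 and 1 (illustrative IDs).
    print(pin_to_cores({0, 1}))
```

At the container level, the same effect is obtained declaratively (e.g. `docker run --cpuset-cpus="0,1" ...`), which keeps the service's threads from migrating across cores and, as the abstract reports, can reduce end-to-end latency.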