All-you-can-inference: serverless DNN model inference suite

Subin Park, J. Choi, Kyungyong Lee
DOI: 10.1145/3565382.3565878
Published in: Proceedings of the Eighth International Workshop on Serverless Computing, 2022-11-07
Citations: 1

Abstract

Serverless computing has become prevalent and is widely adopted for various applications. Deep learning inference tasks are well suited to serverless deployment because task arrival events fluctuate over time. When serving a Deep Neural Network (DNN) model in a serverless computing environment, there are many performance optimization opportunities, including hardware support, model graph optimization, hardware-agnostic model compilation, and memory-size and batch-size configurations, among others. Although serverless computing frees users from cloud resource management overhead, it is still very challenging to find an optimal serverless DNN inference environment among the very large space of configuration options. In this work, we propose All-You-Can-Inference (AYCI), which helps users find an optimally operating DNN inference deployment in a publicly available serverless computing environment. We have built the proposed system as a service using various fully managed cloud services and open-sourced it to help DNN application developers build an optimal serving environment. The prototype implementation and initial experimental results illustrate the difficulty of finding an optimal DNN inference environment given the varying performance.
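To make the size of the configuration search concrete, the following is a minimal, hypothetical sketch of the kind of search the abstract describes: enumerating the Cartesian product of candidate serverless settings (memory size, batch size, compilation backend) and selecting the one that minimizes a user-supplied latency measurement. The function names, candidate values, and `measure` callback are illustrative assumptions, not AYCI's actual API.

```python
from itertools import product

def enumerate_configs(memory_mb, batch_sizes, compilers):
    """Build the Cartesian product of candidate serverless inference settings.

    Illustrative only: real systems would also vary hardware type,
    graph-optimization passes, concurrency limits, etc.
    """
    return [
        {"memory_mb": m, "batch_size": b, "compiler": c}
        for m, b, c in product(memory_mb, batch_sizes, compilers)
    ]

def pick_best(configs, measure):
    """Return the configuration minimizing measure(config).

    `measure` is a user-supplied callback that deploys the function with
    the given config and returns an observed latency (or cost) figure.
    """
    return min(configs, key=measure)

# Even three small candidate lists yield 3 * 2 * 2 = 12 deployments to benchmark.
configs = enumerate_configs(
    memory_mb=[512, 1024, 2048],
    batch_sizes=[1, 8],
    compilers=["tvm", "onnxruntime"],  # hypothetical backend names
)
```

Each candidate requires a separate deployment and benchmark run, so the search cost grows multiplicatively with every new dimension — which is the difficulty AYCI is built to manage.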