An Online Approach for DNN Model Caching and Processor Allocation in Edge Computing

Zhiqi Chen, Shenmin Zhang, Zhi Ma, Shuai Zhang, Zhuzhong Qian, Mingjun Xiao, Jie Wu, Sanglu Lu
DOI: 10.1109/IWQoS54832.2022.9812874
Published in: 2022 IEEE/ACM 30th International Symposium on Quality of Service (IWQoS)
Publication date: 2022-06-10
Citations: 1

Abstract

Edge computing is a new computing paradigm that has risen gradually in recent years. Applications such as object detection, virtual reality, and intelligent cameras often leverage Deep Neural Network (DNN) inference. The traditional cloud-based DNN inference paradigm suffers from high delay because of limited bandwidth. From the perspective of service providers, caching DNN models at the edge brings several benefits, such as efficiency, privacy, and security. The problem we address in this paper is how to decide which models to cache and how to allocate the processors of edge servers to reduce the overall system cost. To solve it, we model and study the DNN Model Caching and Processor Allocation (DMCPA) problem, which considers user-perceived delay and energy consumption under limited edge resources. We formulate it as an integer nonlinear programming (INLP) problem and prove its NP-completeness. Since it is a long-term average optimization problem, we leverage the Lyapunov framework to develop a novel online algorithm, DMCPA-GS-Online, based on Gibbs sampling. We give a theoretical analysis to prove that our algorithm is near-optimal. In experiments, we study the performance of our algorithm and compare it with other baselines. Simulation results on a real-world trace dataset demonstrate the effectiveness and adaptiveness of our algorithm.
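The abstract combines the Lyapunov drift-plus-penalty framework with Gibbs sampling to search the discrete space of caching decisions. The following is only a minimal illustrative sketch of the Gibbs-sampling ingredient, not the paper's DMCPA-GS-Online algorithm: the cost function, the capacity constraint, and the temperature parameter here are invented for illustration.

```python
import math
import random

def gibbs_sample_caching(cost, n_models, capacity, iters=500, temp=0.5, seed=0):
    """Illustrative Gibbs sampler over a binary model-caching vector.

    cost(x) -> float : hypothetical per-slot system cost of caching vector x.
    capacity         : toy constraint on how many models fit on the edge server.
    """
    rng = random.Random(seed)
    x = [0] * n_models  # start from an empty (feasible) cache
    for _ in range(iters):
        i = rng.randrange(n_models)  # resample one coordinate, others fixed
        x0, x1 = x.copy(), x.copy()
        x0[i], x1[i] = 0, 1
        c0 = cost(x0)
        # Caching model i is only allowed if the capacity constraint holds.
        c1 = cost(x1) if sum(x1) <= capacity else float("inf")
        if c1 == float("inf"):
            p1 = 0.0
        else:
            # Boltzmann probability of x[i] = 1 given the other coordinates;
            # clamp the exponent to avoid overflow in math.exp.
            z = max(min((c1 - c0) / temp, 700.0), -700.0)
            p1 = 1.0 / (1.0 + math.exp(z))
        x[i] = 1 if rng.random() < p1 else 0
    return x
```

As the temperature parameter approaches zero, the stationary distribution concentrates on minimum-cost configurations, which is the standard argument behind near-optimality claims for Gibbs-sampling-based combinatorial optimization.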