Cut, Distil and Encode (CDE): Split Cloud-Edge Deep Inference

Marion Sbai, Muhamad Risqi U. Saputra, Niki Trigoni, A. Markham
DOI: 10.1109/SECON52354.2021.9491600
Published in: 2021 18th Annual IEEE International Conference on Sensing, Communication, and Networking (SECON), 2021-07-06
Citations: 12

Abstract

In cloud-edge environments, running all Deep Neural Network (DNN) models on the cloud causes significant network congestion and high latency, whereas the exclusive use of the edge device for execution limits the size and structure of the DNN, impacting accuracy. This paper introduces a novel partitioning approach for DNN inference between the edge and the cloud. This is the first work to consider simultaneous optimization of both the memory usage at the edge and the size of the data to be transferred over the wireless link. The experiments were performed on two different network architectures, MobileNetV1 and VGG16. The proposed approach makes it possible to execute part of the network on very constrained devices (e.g., microcontrollers), and under poor network conditions (e.g., LoRa) whilst retaining reasonable accuracies. Moreover, the results show that the choice of the optimal layer to split the network depends on the bandwidth and memory constraints, whereas prior work suggests that the best choice is always to split the network at higher layers. We demonstrate superior performance compared to existing techniques.
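The abstract's key claim is that the optimal split layer is not fixed but depends jointly on link bandwidth and edge memory. A minimal sketch of that trade-off is below; the layer names, activation sizes, and memory figures are illustrative assumptions, not values from the paper, and the selection criterion (minimize transfer time among memory-feasible splits) is a simplification of the full optimization.

```python
# Hypothetical sketch of split-layer selection under bandwidth and
# memory constraints, in the spirit of the paper's cloud-edge split.
# All numbers below are made up for illustration.

def best_split(layers, bandwidth_bps, edge_mem_limit):
    """Pick the split layer with the smallest transfer time among
    splits whose cumulative edge-side memory fits the budget.
    `layers` is a list of (name, output_bytes, cum_edge_mem_bytes)."""
    feasible = [(out_bytes / bandwidth_bps, name)
                for name, out_bytes, mem_bytes in layers
                if mem_bytes <= edge_mem_limit]
    if not feasible:
        return None  # no prefix of the network fits on the edge device
    return min(feasible)[1]

# Toy profile: deeper layers produce smaller activations (cheaper to
# transmit) but require more memory to execute on the edge.
profile = [
    ("conv1", 4_000_000,   100_000),
    ("conv3",   800_000,   400_000),
    ("fc6",      16_000, 2_000_000),
]

# With a generous memory budget the smallest activation wins; with a
# tight 500 kB budget the optimum moves to an earlier layer, showing
# why the best split depends on the constraints rather than always
# being a higher layer.
print(best_split(profile, 125_000, 4_000_000))  # fc6
print(best_split(profile, 125_000, 500_000))    # conv3
```

Under a LoRa-class link (very low bandwidth), the transfer term dominates even more strongly, while a microcontroller-class device tightens the memory constraint, pushing the two objectives in opposite directions.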