Towards Resource-aware DNN Partitioning for Edge Devices with Heterogeneous Resources

GLOBECOM 2022 - 2022 IEEE Global Communications Conference. Pub Date: 2022-12-04. DOI: 10.1109/GLOBECOM48099.2022.10000839

Muhammad Zawish, L. Abraham, K. Dev, Steven Davy

Abstract

Collaborative deep neural network (DNN) inference over edge and cloud is emerging as an effective approach for enabling several Internet of Things (IoT) applications. Edge devices are mostly resource-constrained and hence cannot afford the computational complexity of DNNs. Consequently, researchers have resorted to a collaborative computing approach in which a DNN is partitioned between edge and cloud. Recent work on DNN partitioning has either focused on bandwidth-specific partitioning or relied on offline benchmarking of DNN layers. However, edge devices are inherently heterogeneous and possess inconsistent levels and types of resources. Therefore, in this work, we propose a resource-aware partitioning of DNNs for accelerating collaborative inference over edge and cloud. The proposed approach provides the flexibility to partition a DNN according to the nature and scale of the resources available on a given edge device. Unlike state-of-the-art approaches, we exploit different types of DNN complexity when partitioning DNNs across heterogeneous edge devices. For example, in a bandwidth-constrained scenario, our approach achieved a 40% efficiency gain over the offline benchmarking approach. Therefore, given the differing computational, storage, and energy requirements of edge devices, this approach provides a suitable configuration for edge-cloud synergetic inference.
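To make the idea of choosing a partition point concrete, the following is a minimal Python sketch of a generic split search. It is not the authors' algorithm, which the abstract does not specify. It assumes a purely sequential DNN, a simple latency model (edge compute, then transfer of one activation, then cloud compute), and a single storage constraint on the edge device; energy is not modelled. The names `Layer`, `best_split`, and `TOY_LAYERS`, and all numbers, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Layer:
    name: str
    flops: float        # compute cost of the layer (floating-point operations)
    param_bytes: float  # storage needed for the layer's weights
    out_bytes: float    # size of the activation the layer emits


def best_split(
    layers: List[Layer],
    input_bytes: float,
    edge_flops_per_s: float,
    cloud_flops_per_s: float,
    bandwidth_bytes_per_s: float,
    edge_mem_bytes: float,
) -> Tuple[Optional[int], float]:
    """Return (k, latency): layers[:k] run on the edge, layers[k:] in the cloud."""
    best_k, best_latency = None, float("inf")
    for k in range(len(layers) + 1):
        edge_part, cloud_part = layers[:k], layers[k:]
        if sum(l.param_bytes for l in edge_part) > edge_mem_bytes:
            break  # weights no longer fit; a larger k only needs more memory
        # Data crossing the link: the raw input if k == 0, else layer k-1's output.
        sent = input_bytes if k == 0 else layers[k - 1].out_bytes
        # Assumption: nothing is transferred when every layer stays on the edge.
        transfer = sent / bandwidth_bytes_per_s if cloud_part else 0.0
        latency = (
            sum(l.flops for l in edge_part) / edge_flops_per_s
            + transfer
            + sum(l.flops for l in cloud_part) / cloud_flops_per_s
        )
        if latency < best_latency:
            best_k, best_latency = k, latency
    return best_k, best_latency


# Hypothetical three-layer network; every figure below is made up.
TOY_LAYERS = [
    Layer("conv1", flops=2e8, param_bytes=1e4, out_bytes=8e5),
    Layer("conv2", flops=4e8, param_bytes=1e5, out_bytes=2e5),
    Layer("fc", flops=1e8, param_bytes=4e6, out_bytes=4e3),
]

if __name__ == "__main__":
    # Same device and model, three resource situations.
    for label, bandwidth, mem in [
        ("slow link, ample storage", 1e5, 1e7),
        ("fast link, ample storage", 1e8, 1e7),
        ("slow link, tight storage", 1e5, 2e5),
    ]:
        k, latency = best_split(TOY_LAYERS, 6e5, 1e9, 1e11, bandwidth, mem)
        print(f"{label}: split after {k} layer(s), latency {latency:.3f} s")
```

Under this toy model the chosen split point moves as bandwidth or edge storage changes, which is the kind of device-dependent behaviour the abstract describes. A real system would also need measured or estimated per-layer costs and an energy term.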
