Towards Resource-aware DNN Partitioning for Edge Devices with Heterogeneous Resources
Authors: Muhammad Zawish, L. Abraham, K. Dev, Steven Davy
Venue: GLOBECOM 2022 - 2022 IEEE Global Communications Conference
Publication date: 2022-12-04
DOI: 10.1109/GLOBECOM48099.2022.10000839 (https://doi.org/10.1109/GLOBECOM48099.2022.10000839)
Citation count: 0
Abstract
Collaborative deep neural network (DNN) inference over edge and cloud is emerging as an effective approach for enabling several Internet of Things (IoT) applications. Edge devices are mostly resource-constrained and hence cannot afford the computational complexity of DNNs. Researchers have therefore resorted to a collaborative computing approach in which a DNN is partitioned between edge and cloud. Recent work on DNN partitioning has either focused on bandwidth-specific partitioning or relied on offline benchmarking of DNN layers. However, edge devices are inherently heterogeneous and differ in both the levels and the types of resources they offer. In this work, we therefore propose resource-aware partitioning of DNNs to accelerate collaborative edge-cloud inference. The proposed approach allows a DNN to be partitioned according to the nature and scale of the resources available on a given edge device. Unlike state-of-the-art approaches, we exploit different types of DNN complexity to partition DNNs across heterogeneous edge devices. For example, in a bandwidth-constrained scenario, our approach gained 40% in efficiency compared to the offline benchmarking approach. Given the differing computational, storage, and energy requirements of edge devices, this approach provides a suitable configuration for edge-cloud synergetic inference.
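The abstract does not give the partitioning algorithm itself, so the following is only a minimal sketch of the general idea of resource-aware split-point selection, not the authors' method. It assumes a purely sequential DNN, a simple additive latency model (edge compute + uplink transfer + cloud compute), and a hard limit on edge memory for weights. All names (`Layer`, `EdgeProfile`, `choose_split`) and all numbers are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Layer:
    name: str
    flops: float         # compute cost of the layer (hypothetical figure)
    param_bytes: float   # storage needed for the layer's weights
    output_bytes: float  # size of the activation this layer produces


@dataclass
class EdgeProfile:
    flops_per_s: float         # sustained compute rate of the edge device
    memory_bytes: float        # memory available for weights on the edge
    uplink_bytes_per_s: float  # edge-to-cloud bandwidth


def choose_split(layers: List[Layer], input_bytes: float,
                 edge: EdgeProfile, cloud_flops_per_s: float) -> Tuple[int, float]:
    """Return (k, latency): layers[:k] run on the edge, layers[k:] on the cloud.

    Assumed latency model: edge compute + uplink transfer + cloud compute.
    Splits whose edge-side weights exceed edge memory are skipped.
    """
    total_flops = sum(layer.flops for layer in layers)
    best_k, best_latency = 0, float("inf")
    edge_flops = 0.0
    edge_params = 0.0
    for k in range(len(layers) + 1):
        if k > 0:
            edge_flops += layers[k - 1].flops
            edge_params += layers[k - 1].param_bytes
        if edge_params > edge.memory_bytes:
            break  # weights only accumulate, so later splits are infeasible too
        if k == len(layers):
            sent = 0.0                         # everything runs on the edge
        elif k == 0:
            sent = input_bytes                 # raw input goes to the cloud
        else:
            sent = layers[k - 1].output_bytes  # intermediate activation is sent
        latency = (edge_flops / edge.flops_per_s
                   + sent / edge.uplink_bytes_per_s
                   + (total_flops - edge_flops) / cloud_flops_per_s)
        if latency < best_latency:
            best_k, best_latency = k, latency
    return best_k, best_latency


# Hypothetical three-layer model and a 600 kB input.
LAYERS = [
    Layer("conv1", flops=2e8, param_bytes=1e4, output_bytes=8e5),
    Layer("conv2", flops=4e8, param_bytes=1e5, output_bytes=2e5),
    Layer("fc", flops=1e8, param_bytes=4e7, output_bytes=4e3),
]
INPUT_BYTES = 6e5

if __name__ == "__main__":
    slow_link = EdgeProfile(flops_per_s=1e9, memory_bytes=1e6, uplink_bytes_per_s=2.5e5)
    k, latency = choose_split(LAYERS, INPUT_BYTES, slow_link, cloud_flops_per_s=1e11)
    print(f"split after {k} layer(s), estimated latency {latency:.3f} s")
```

Under these assumed numbers, the chosen split moves deeper into the network as the uplink slows down, and the memory limit keeps the weight-heavy final layer off the edge. A fuller model would also account for energy, which the abstract names as a third resource dimension.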