Slimmable Encoders for Flexible Split DNNs in Bandwidth and Resource Constrained IoT Systems

Juliano S. Assine, José Cândido Silveira Santos Filho, Eduardo Valle, M. Levorato
DOI: 10.1109/WoWMoM57956.2023.00014
Published in: 2023 IEEE 24th International Symposium on a World of Wireless, Mobile and Multimedia Networks (WoWMoM), June 2023

Abstract

The execution of large deep neural networks (DNNs) at mobile edge devices requires considerable consumption of critical resources, such as energy, while imposing demands on hardware capabilities. In approaches based on edge computing, the execution of the models is offloaded to a compute-capable device positioned at the edge of 5G infrastructures. The main issue of the latter class of approaches is the need to transport information-rich signals over wireless links with limited and time-varying capacity. The recent split computing paradigm attempts to resolve this impasse by distributing the execution of DNN models across the layers of the system to reduce the amount of data to be transmitted while imposing minimal computing load on mobile devices. In this context, we propose a novel split computing approach based on slimmable ensemble encoders. The key advantage of our design is the ability to adapt computational load and transmitted data size in real time with minimal overhead and latency. This is in contrast with existing approaches, where the same adaptation requires costly context switching and model loading. Moreover, our model outperforms existing solutions in terms of compression efficacy and execution time, especially in the context of weak mobile devices. We present a comprehensive comparison with the most advanced split computing solutions, as well as an experimental evaluation on GPU-less devices.
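To illustrate the core mechanism the abstract describes — adapting computational load and transmitted data size at runtime without reloading a model — the following is a minimal sketch, not the authors' code. It assumes the standard slimmable-network idea of using only a prefix of a layer's output channels; the class name `SlimmableLinear` and all parameters are hypothetical and chosen for illustration.

```python
import numpy as np

class SlimmableLinear:
    """Hypothetical slimmable layer: one weight matrix is trained once,
    and at inference time only the first `width` output channels are
    used, shrinking both compute and the feature vector to transmit."""

    def __init__(self, in_features, max_out_features, seed=0):
        rng = np.random.default_rng(seed)
        self.W = rng.standard_normal((max_out_features, in_features)) * 0.01
        self.b = np.zeros(max_out_features)

    def forward(self, x, width):
        # Slice, don't reload: switching width is just a different view
        # of the same parameters, so adaptation has negligible overhead.
        return self.W[:width] @ x + self.b[:width]

encoder = SlimmableLinear(in_features=64, max_out_features=32)
x = np.ones(64)

small = encoder.forward(x, width=8)    # compact feature for a weak link
large = encoder.forward(x, width=32)   # full-width feature

assert small.shape == (8,)
# Nested sub-network property: the narrow output is a prefix of the wide one.
assert np.allclose(small, large[:8])
```

The design point this sketch captures is that, unlike approaches that swap between separately trained encoders (costly context switching and model loading), a slimmable encoder keeps a single parameter set resident and changes only which slice of it is active.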