{"title":"$\\text{Edge}^{n}$ AI: Distributed Inference with Local Edge Devices and Minimal Latency","authors":"Maedeh Hemmat, A. Davoodi, Y. Hu","doi":"10.1109/ASP-DAC52403.2022.9712496","DOIUrl":null,"url":null,"abstract":"We propose $\\text{Edge}^{n}$ AI, a framework to decompose a complex deep neural networks (DNN) over $n$ available local edge devices with minimal communication overhead and overall latency. Our framework creates small DNNs (SNNs) from an original DNN by partitioning its classes across the edge devices, while taking into account their available resources. Class-aware pruning is applied to aggressively reduce the size of the SNN on each edge device. The SNNs perform inference in parallel, and are configured to generate a ‘Don't Know’ response when an unassigned class is identified. Our experiments show up to 17X inference speedup compared to a recent work, on devices of at most 150 MB memory when distributing a variant of VGG-16 over 20 parallel edge devices.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASP-DAC52403.2022.9712496","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2
Abstract
We propose $\text{Edge}^{n}$ AI, a framework to decompose a complex deep neural network (DNN) over $n$ available local edge devices with minimal communication overhead and overall latency. Our framework creates small DNNs (SNNs) from an original DNN by partitioning its classes across the edge devices while taking into account their available resources. Class-aware pruning is applied to aggressively reduce the size of the SNN on each edge device. The SNNs perform inference in parallel and are configured to generate a ‘Don't Know’ response when the input belongs to a class not assigned to them. Our experiments show up to 17X inference speedup compared to a recent work, on devices with at most 150 MB of memory, when distributing a variant of VGG-16 over 20 parallel edge devices.
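The abstract describes three mechanisms: capacity-aware class partitioning, per-device SNNs, and parallel inference with ‘Don't Know’ aggregation. Below is a minimal Python sketch of that flow under stated assumptions. The capacity-proportional partition heuristic, the extra ‘Don't Know’ output per SNN, and highest-confidence aggregation are all illustrative guesses, and the names (`partition_classes`, `snn_infer`, `aggregate`) are hypothetical, not from the paper.

```python
import numpy as np

def partition_classes(num_classes: int, capacities: list[float]) -> list[list[int]]:
    """Split class labels across devices in proportion to device capacity.

    Illustrative heuristic: the paper partitions classes according to each
    device's available resources, but the exact policy is not given in the
    abstract, so this proportional split is an assumption.
    """
    weights = np.array(capacities, dtype=float)
    quotas = np.floor(num_classes * weights / weights.sum()).astype(int)
    # Hand out classes lost to flooring, one each to the largest devices.
    for i in np.argsort(-weights)[: num_classes - quotas.sum()]:
        quotas[i] += 1
    labels = iter(range(num_classes))
    return [[next(labels) for _ in range(q)] for q in quotas]

def snn_infer(logits: np.ndarray, assigned: list[int]):
    """Local decision of one SNN over its assigned classes.

    Assumption: each SNN has len(assigned) + 1 outputs, the last being an
    explicit 'Don't Know' class covering everything not assigned to it.
    Returns (global_class, confidence) or None for 'Don't Know'.
    """
    probs = np.exp(logits - logits.max())   # numerically stable softmax
    probs /= probs.sum()
    best = int(np.argmax(probs))
    if best == len(assigned):               # the 'Don't Know' output won
        return None
    return assigned[best], float(probs[best])

def aggregate(responses):
    """Fuse parallel SNN responses: most confident non-'Don't Know' answer."""
    answered = [r for r in responses if r is not None]
    if not answered:
        return None                         # every device said 'Don't Know'
    return max(answered, key=lambda r: r[1])[0]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    parts = partition_classes(num_classes=10, capacities=[150.0, 100.0, 50.0])
    # Random logits stand in for each device's pruned SNN forward pass.
    responses = [snn_infer(rng.normal(size=len(p) + 1), p) for p in parts]
    print("class partition per device:", parts)
    print("aggregated prediction:", aggregate(responses))
```

Because the class assignments are disjoint, at most one SNN should recognize a given input; the max-confidence fusion above is only a defensive tie-break for the case where pruning errors make several devices answer at once.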