Maedeh Hemmat, A. Davoodi, Y. Hu. 2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC). Published 2022-01-17. DOI: 10.1109/ASP-DAC52403.2022.9712496
$\text{Edge}^{n}$ AI: Distributed Inference with Local Edge Devices and Minimal Latency
We propose $\text{Edge}^{n}$ AI, a framework to decompose a complex deep neural network (DNN) across $n$ available local edge devices with minimal communication overhead and overall latency. Our framework creates small DNNs (SNNs) from the original DNN by partitioning its classes across the edge devices, taking into account each device's available resources. Class-aware pruning is applied to aggressively reduce the size of the SNN on each edge device. The SNNs perform inference in parallel and are configured to generate a 'Don't Know' response when an unassigned class is identified. Our experiments show up to a 17X inference speedup compared to a recent work, on devices with at most 150 MB of memory, when distributing a variant of VGG-16 over 20 parallel edge devices.
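The parallel-inference scheme described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the per-device predictors, the confidence-based aggregation, and all names (`make_snn`, `distributed_inference`, `DONT_KNOW`) are hypothetical stand-ins for how disjoint class partitions and 'Don't Know' responses might be combined.

```python
from concurrent.futures import ThreadPoolExecutor

DONT_KNOW = "Don't Know"

def make_snn(assigned_classes):
    """Build a toy per-device SNN covering a disjoint subset of classes.

    A real SNN would run pruned-DNN inference; here the input is
    treated as its own true class label to keep the sketch runnable.
    """
    def predict(x):
        if x in assigned_classes:
            return (x, 0.9)          # (predicted class, confidence)
        return (DONT_KNOW, 0.0)      # class not assigned to this device
    return predict

def distributed_inference(x, snns):
    """Run all SNNs in parallel; keep the most confident non-'Don't Know' answer."""
    with ThreadPoolExecutor(max_workers=len(snns)) as pool:
        results = list(pool.map(lambda snn: snn(x), snns))
    answers = [r for r in results if r[0] != DONT_KNOW]
    return max(answers, key=lambda r: r[1])[0] if answers else DONT_KNOW

# Partition 6 classes across 3 edge devices, 2 classes each.
snns = [make_snn({0, 1}), make_snn({2, 3}), make_snn({4, 5})]
print(distributed_inference(3, snns))  # class 3 is handled by the second SNN
print(distributed_inference(9, snns))  # no SNN claims class 9 -> "Don't Know"
```

Because each class belongs to exactly one partition, at most one SNN answers per input; the aggregation step only needs a tie-break (here, confidence) if partitions were allowed to overlap.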