$\text{Edge}^{n}$ AI: Distributed Inference with Local Edge Devices and Minimal Latency

Maedeh Hemmat, A. Davoodi, Y. Hu
{"title":"$\\text{Edge}^{n}$ AI: Distributed Inference with Local Edge Devices and Minimal Latency","authors":"Maedeh Hemmat, A. Davoodi, Y. Hu","doi":"10.1109/ASP-DAC52403.2022.9712496","DOIUrl":null,"url":null,"abstract":"We propose $\\text{Edge}^{n}$ AI, a framework to decompose a complex deep neural networks (DNN) over $n$ available local edge devices with minimal communication overhead and overall latency. Our framework creates small DNNs (SNNs) from an original DNN by partitioning its classes across the edge devices, while taking into account their available resources. Class-aware pruning is applied to aggressively reduce the size of the SNN on each edge device. The SNNs perform inference in parallel, and are configured to generate a ‘Don't Know’ response when an unassigned class is identified. Our experiments show up to 17X inference speedup compared to a recent work, on devices of at most 150 MB memory when distributing a variant of VGG-16 over 20 parallel edge devices.","PeriodicalId":239260,"journal":{"name":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","volume":"41 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 27th Asia and South Pacific Design Automation Conference (ASP-DAC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ASP-DAC52403.2022.9712496","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

Abstract

We propose $\text{Edge}^{n}$ AI, a framework that decomposes a complex deep neural network (DNN) across $n$ available local edge devices with minimal communication overhead and overall latency. The framework creates small DNNs (SNNs) from the original DNN by partitioning its classes across the edge devices while taking their available resources into account. Class-aware pruning is applied to aggressively reduce the size of the SNN on each edge device. The SNNs perform inference in parallel and are configured to generate a ‘Don't Know’ response when an input from an unassigned class is identified. Our experiments show up to a 17× inference speedup over a recent approach when distributing a variant of VGG-16 across 20 parallel edge devices, each with at most 150 MB of memory.
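The paper itself includes no code; the following is a minimal Python sketch of the parallel-inference idea described in the abstract. The `SNN` class, the confidence-threshold rule for producing a ‘Don't Know’ response, and the highest-score aggregation are illustrative assumptions, not the authors' implementation (class-aware pruning itself is out of scope here).

```python
import random
from concurrent.futures import ThreadPoolExecutor

# Sketch of class-partitioned parallel inference with a "Don't Know"
# response. All names and the confidence-threshold rule are assumptions
# for illustration, not taken from the paper.

class SNN:
    """One edge device's small network, covering a subset of the classes."""
    def __init__(self, model, assigned_classes, threshold=0.5):
        self.model = model                  # callable: input -> {class: score}
        self.assigned = set(assigned_classes)
        self.threshold = threshold          # confidence cutoff for "Don't Know"

    def predict(self, x):
        scores = self.model(x)
        # Consider only the classes this device was assigned.
        cls, score = max(((c, s) for c, s in scores.items() if c in self.assigned),
                         key=lambda cs: cs[1], default=(None, 0.0))
        # Low confidence (or no assigned class) -> "Don't Know" (None).
        return (cls, score) if cls is not None and score >= self.threshold else None

def distributed_infer(snns, x):
    """Query all SNNs in parallel; keep the most confident real answer."""
    with ThreadPoolExecutor(max_workers=len(snns)) as pool:
        answers = [a for a in pool.map(lambda s: s.predict(x), snns) if a is not None]
    return max(answers, key=lambda a: a[1])[0] if answers else None

# Toy usage: 4 devices covering 20 classes, 5 classes each.
if __name__ == "__main__":
    def fake_model(x):                      # stand-in for a pruned sub-network
        return {c: random.random() for c in range(20)}
    snns = [SNN(fake_model, range(i * 5, (i + 1) * 5)) for i in range(4)]
    print(distributed_infer(snns, x=None))
```

In the actual system, each `model` would be the class-aware-pruned sub-network deployed on one edge device, and the map step would be network requests to the devices rather than local threads.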