Elf: accelerate high-resolution mobile deep vision with content-aware parallel offloading

Proceedings of the 27th Annual International Conference on Mobile Computing and Networking Pub Date : 2021-09-09 DOI:10.1145/3447993.3448628

Wuyang Zhang, Zhezhi He, Luyang Liu, Zhenhua Jia, Yunxin Liu, M. Gruteser, D. Raychaudhuri, Yanyong Zhang


Citations: 75

Abstract

As mobile devices continuously generate streams of images and videos, a new class of mobile deep vision applications is rapidly emerging; these applications usually run deep neural networks on the multimedia data in real time. To support such applications, having mobile devices offload computation, especially neural network inference, to edge clouds has proved effective. Existing solutions often assume a dedicated, powerful server to which the entire inference can be offloaded. In reality, however, such a server may not be available, and we must make do with less powerful ones. To address these more practical situations, we propose to partition the video frame and offload the partial inference tasks to multiple servers for parallel processing. This paper presents the design of Elf, a framework that accelerates mobile deep vision applications under any server provisioning through parallel offloading. Elf employs a recurrent region proposal prediction algorithm, region-proposal-centric frame partitioning, and a resource-aware multi-offloading scheme. We implement and evaluate Elf on Linux and Android platforms using four commercial mobile devices and three deep vision applications with ten state-of-the-art models. Comprehensive experiments show that Elf can speed up the applications by 4.85× while reducing bandwidth usage by 52.6%, at a cost of less than 1% in application accuracy.
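
To make the idea of partitioning a frame and spreading partial inference tasks over several servers more concrete, here is a minimal Python sketch. It is not Elf's algorithm: the abstract does not give those details, so the sketch swaps in a plain greedy load-balancing heuristic. The function names (`area`, `assign_boxes`, `enclosing_crop`), the use of box area as a stand-in for inference cost, and the per-server `capacities` numbers are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# A region proposal as (x1, y1, x2, y2) pixel coordinates (assumed layout).
Box = Tuple[int, int, int, int]


def area(box: Box) -> int:
    """Pixel area of a box; used here as a rough proxy for inference cost."""
    x1, y1, x2, y2 = box
    return max(0, x2 - x1) * max(0, y2 - y1)


def assign_boxes(boxes: List[Box],
                 capacities: Dict[str, float]) -> Dict[str, List[Box]]:
    """Greedy assignment (hypothetical, not Elf's scheme).

    Boxes are visited largest first; each goes to the server whose
    projected finish time (assigned area / capacity) would be lowest.
    """
    plan: Dict[str, List[Box]] = {name: [] for name in capacities}
    load: Dict[str, float] = {name: 0.0 for name in capacities}
    for box in sorted(boxes, key=area, reverse=True):
        best = min(capacities,
                   key=lambda s: (load[s] + area(box)) / capacities[s])
        plan[best].append(box)
        load[best] += area(box)
    return plan


def enclosing_crop(boxes: List[Box]) -> Box:
    """Smallest rectangle covering all given boxes: the crop to upload."""
    xs1, ys1, xs2, ys2 = zip(*boxes)
    return (min(xs1), min(ys1), max(xs2), max(ys2))


# Example: four predicted regions, one fast server and one slow server.
proposals = [(0, 0, 100, 100), (200, 0, 250, 50),
             (0, 200, 50, 250), (300, 300, 400, 400)]
servers = {"a": 3.0, "b": 1.0}  # relative capacities (made-up numbers)
plan = assign_boxes(proposals, servers)
crops = {name: enclosing_crop(bs) for name, bs in plan.items() if bs}
```

In this sketch each server receives only the crop that encloses its assigned regions rather than the whole frame, which is one plausible way that partitioning could reduce both per-server work and upload size. A real system would also need to predict the regions before the frame is processed and merge the partial results afterwards.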

