AMVP: Adaptive CNN-based Multitask Video Processing on Mobile Stream Processing Platforms

M. Chao, R. Stoleru, Liuyi Jin, Shuochao Yao, Maxwell Maurice, R. Blalock
{"title":"AMVP: Adaptive CNN-based Multitask Video Processing on Mobile Stream Processing Platforms","authors":"M. Chao, R. Stoleru, Liuyi Jin, Shuochao Yao, Maxwell Maurice, R. Blalock","doi":"10.1109/SEC50012.2020.00015","DOIUrl":null,"url":null,"abstract":"The popularity of video cameras has spawned a new type of application called multitask video processing, which uses multiple CNNs to obtain different information of interests from a raw video stream. Unfortunately, the huge resource requirements of CNNs make the concurrent execution of multiple CNNs on a single resource-constrained mobile device challenging. Existing solutions solve this challenge by offloading CNN models to the cloud or edge server, compressing CNN models to fit the mobile device, or sharing some common parts of multiple CNN models. Most of these solutions, however, use the above offloading, compression or sharing strategies in a separate manner, which fail to adapt to the complex edge computing scenario well. In this paper, to solve the above limitation, we propose AMVP, an adaptive execution framework for CNN-based multitask video processing, which elegantly integrates the strategies of CNN layer sharing, feature compression, and model offloading. First, AMVP reduces the total computation workload of multiple CNN inference by sharing some common frozen CNN layers. Second, AMVP supports distributed CNN inference by splitting big CNNs into smaller components running on different devices. Third, AMVP leverages a quantization-based feature compression mechanism to reduce the feature transmission traffic size between two separate CNN components. We conduct extensive experiments on AMVP and the experimental results show that our AMVP framework can adapt to different performance goals and execution environments. 
Compared to two baseline approaches that only share or offload CNN layers, AMVP achieves up to 61% lower latency and 10% higher throughput with comparative accuracy.","PeriodicalId":375577,"journal":{"name":"2020 IEEE/ACM Symposium on Edge Computing (SEC)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE/ACM Symposium on Edge Computing (SEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SEC50012.2020.00015","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8

Abstract

The popularity of video cameras has spawned a new type of application called multitask video processing, which uses multiple CNNs to extract different information of interest from a raw video stream. Unfortunately, the huge resource requirements of CNNs make the concurrent execution of multiple CNNs on a single resource-constrained mobile device challenging. Existing solutions address this challenge by offloading CNN models to the cloud or an edge server, compressing CNN models to fit the mobile device, or sharing common parts of multiple CNN models. Most of these solutions, however, apply the offloading, compression, or sharing strategies in isolation, and therefore fail to adapt well to complex edge computing scenarios. In this paper, to overcome this limitation, we propose AMVP, an adaptive execution framework for CNN-based multitask video processing that elegantly integrates the strategies of CNN layer sharing, feature compression, and model offloading. First, AMVP reduces the total computation workload of multiple CNN inferences by sharing common frozen CNN layers. Second, AMVP supports distributed CNN inference by splitting large CNNs into smaller components running on different devices. Third, AMVP leverages a quantization-based feature compression mechanism to reduce the feature transmission traffic between two separate CNN components. We conduct extensive experiments on AMVP, and the results show that the AMVP framework can adapt to different performance goals and execution environments. Compared to two baseline approaches that only share or only offload CNN layers, AMVP achieves up to 61% lower latency and 10% higher throughput with comparable accuracy.
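The abstract does not specify the exact quantization scheme AMVP uses, but the core idea of quantization-based feature compression can be illustrated with a minimal sketch: before an intermediate feature map is transmitted between two split CNN components, the sender quantizes the float32 activations to uint8 (a 4x traffic reduction), and the receiver dequantizes them before running the remaining layers. The function names and the uniform affine scheme below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def quantize_features(x: np.ndarray):
    """Uniform affine quantization of a float feature map to uint8.

    Returns the uint8 codes plus the (scale, offset) pair the receiving
    device needs to approximately reconstruct the original features.
    """
    lo, hi = float(x.min()), float(x.max())
    # One quantization step; guard against a constant feature map.
    scale = (hi - lo) / 255.0 or 1.0
    q = np.round((x - lo) / scale).astype(np.uint8)
    return q, scale, lo

def dequantize_features(q: np.ndarray, scale: float, lo: float) -> np.ndarray:
    """Reconstruct an approximate float32 feature map from uint8 codes."""
    return q.astype(np.float32) * scale + lo

# A float32 feature map sent as uint8 shrinks to a quarter of its size,
# and the reconstruction error is bounded by the quantization step.
feat = np.random.randn(1, 56, 56, 64).astype(np.float32)
q, scale, lo = quantize_features(feat)
restored = dequantize_features(q, scale, lo)
assert q.nbytes == feat.nbytes // 4
assert np.max(np.abs(restored - feat)) <= scale
```

In a split-inference setting, only `q`, `scale`, and `lo` would cross the network between the mobile device and the edge server; the trade-off between traffic size and the accuracy loss from the reconstruction error is exactly what an adaptive framework like AMVP must balance.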