AMVP: Adaptive CNN-based Multitask Video Processing on Mobile Stream Processing Platforms

M. Chao, R. Stoleru, Liuyi Jin, Shuochao Yao, Maxwell Maurice, R. Blalock
{"title":"AMVP: Adaptive CNN-based Multitask Video Processing on Mobile Stream Processing Platforms","authors":"M. Chao, R. Stoleru, Liuyi Jin, Shuochao Yao, Maxwell Maurice, R. Blalock","doi":"10.1109/SEC50012.2020.00015","DOIUrl":null,"url":null,"abstract":"The popularity of video cameras has spawned a new type of application called multitask video processing, which uses multiple CNNs to obtain different information of interests from a raw video stream. Unfortunately, the huge resource requirements of CNNs make the concurrent execution of multiple CNNs on a single resource-constrained mobile device challenging. Existing solutions solve this challenge by offloading CNN models to the cloud or edge server, compressing CNN models to fit the mobile device, or sharing some common parts of multiple CNN models. Most of these solutions, however, use the above offloading, compression or sharing strategies in a separate manner, which fail to adapt to the complex edge computing scenario well. In this paper, to solve the above limitation, we propose AMVP, an adaptive execution framework for CNN-based multitask video processing, which elegantly integrates the strategies of CNN layer sharing, feature compression, and model offloading. First, AMVP reduces the total computation workload of multiple CNN inference by sharing some common frozen CNN layers. Second, AMVP supports distributed CNN inference by splitting big CNNs into smaller components running on different devices. Third, AMVP leverages a quantization-based feature compression mechanism to reduce the feature transmission traffic size between two separate CNN components. We conduct extensive experiments on AMVP and the experimental results show that our AMVP framework can adapt to different performance goals and execution environments. 
Compared to two baseline approaches that only share or offload CNN layers, AMVP achieves up to 61% lower latency and 10% higher throughput with comparative accuracy.","PeriodicalId":375577,"journal":{"name":"2020 IEEE/ACM Symposium on Edge Computing (SEC)","volume":"53 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"8","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE/ACM Symposium on Edge Computing (SEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SEC50012.2020.00015","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 8

Abstract

The popularity of video cameras has spawned a new type of application called multitask video processing, which uses multiple CNNs to extract different information of interest from a raw video stream. Unfortunately, the huge resource requirements of CNNs make the concurrent execution of multiple CNNs on a single resource-constrained mobile device challenging. Existing solutions address this challenge by offloading CNN models to the cloud or an edge server, compressing CNN models to fit the mobile device, or sharing common parts of multiple CNN models. Most of these solutions, however, apply the offloading, compression, or sharing strategies in isolation, and therefore fail to adapt well to complex edge computing scenarios. In this paper, to overcome this limitation, we propose AMVP, an adaptive execution framework for CNN-based multitask video processing that elegantly integrates the strategies of CNN layer sharing, feature compression, and model offloading. First, AMVP reduces the total computation workload of multiple CNN inferences by sharing common frozen CNN layers. Second, AMVP supports distributed CNN inference by splitting large CNNs into smaller components running on different devices. Third, AMVP leverages a quantization-based feature compression mechanism to reduce the feature transmission traffic between two separate CNN components. We conduct extensive experiments on AMVP, and the results show that the AMVP framework can adapt to different performance goals and execution environments. Compared to two baseline approaches that only share or only offload CNN layers, AMVP achieves up to 61% lower latency and 10% higher throughput with comparable accuracy.
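The abstract does not specify the exact quantization scheme AMVP uses, but the core idea of quantization-based feature compression can be illustrated with a minimal sketch: before an intermediate feature map is transmitted between two split CNN components, the sender quantizes the float32 activations to uint8 (a 4x traffic reduction), and the receiver dequantizes them before running the remaining layers. The function names and the uniform affine scheme below are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def quantize_features(x: np.ndarray):
    """Uniform affine quantization of a float feature map to uint8.

    Returns the uint8 codes plus the (scale, offset) pair the receiving
    device needs to approximately reconstruct the original features.
    """
    lo, hi = float(x.min()), float(x.max())
    # One quantization step; guard against a constant feature map.
    scale = (hi - lo) / 255.0 or 1.0
    q = np.round((x - lo) / scale).astype(np.uint8)
    return q, scale, lo

def dequantize_features(q: np.ndarray, scale: float, lo: float) -> np.ndarray:
    """Reconstruct an approximate float32 feature map from uint8 codes."""
    return q.astype(np.float32) * scale + lo

# A float32 feature map sent as uint8 shrinks to a quarter of its size,
# and the reconstruction error is bounded by the quantization step.
feat = np.random.randn(1, 56, 56, 64).astype(np.float32)
q, scale, lo = quantize_features(feat)
restored = dequantize_features(q, scale, lo)
assert q.nbytes == feat.nbytes // 4
assert np.max(np.abs(restored - feat)) <= scale
```

In a split-inference setting, only `q`, `scale`, and `lo` would cross the network between the mobile device and the edge server; the trade-off between traffic size and the accuracy loss from the reconstruction error is exactly what an adaptive framework like AMVP must balance.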