LegoDNN: block-grained scaling of deep neural networks for mobile vision

Rui Han, Qinglong Zhang, C. Liu, Guoren Wang, Jian Tang, L. Chen
Proceedings of the 27th Annual International Conference on Mobile Computing and Networking, published 2021-10-19
DOI: 10.1145/3447993.3483249 (https://doi.org/10.1145/3447993.3483249)
Citations: 23

Abstract

Deep neural networks (DNNs) have become ubiquitous in mobile and embedded systems for applications such as image/object recognition and classification. The trend of executing multiple DNNs simultaneously exacerbates the existing difficulty of meeting stringent latency and accuracy requirements on resource-constrained mobile devices. Prior work explores the accuracy-resource tradeoff by scaling model sizes in accordance with resource dynamics. However, such model scaling approaches face two imminent challenges: (i) a large exploration space of model sizes, and (ii) prohibitively long training time for different model combinations. In this paper, we present LegoDNN, a lightweight, block-grained scaling solution for running multi-DNN workloads in mobile vision systems. LegoDNN guarantees short model training times by extracting and training only a small number of common blocks (e.g., 5 in VGG and 8 in ResNet) in a DNN. At run-time, LegoDNN optimally combines the descendant models of these blocks to maximize accuracy under specific resource and latency constraints, while reducing switching overhead via smart block-level scaling of the DNN. We implement LegoDNN in TensorFlow Lite and extensively evaluate it against state-of-the-art techniques (FLOP scaling, knowledge distillation and model compression) using a set of 12 popular DNN models. Evaluation results show that LegoDNN provides 1,296x to 279,936x more options in model sizes without increasing training time, thus achieving as much as a 31.74% improvement in inference accuracy and a 71.07% reduction in scaling energy consumption.
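
The run-time step described in the abstract, choosing one descendant per block so that accuracy is maximized within resource and latency limits, can be illustrated with a small sketch. This is not the paper's algorithm: it is a brute-force search written for illustration, and it assumes that per-block latency, memory and accuracy loss simply add up. The `Variant` type, the function names and all numbers are hypothetical.

```python
from itertools import product
from typing import NamedTuple, Optional, Sequence, Tuple


class Variant(NamedTuple):
    """One hypothetical descendant of a block (all numbers are made up)."""
    name: str
    latency_ms: float
    memory_mb: float
    accuracy_drop: float  # assumed additive accuracy loss vs. the original block


def count_combinations(blocks: Sequence[Sequence[Variant]]) -> int:
    """Number of whole-model configurations: the product of per-block choices."""
    total = 1
    for variants in blocks:
        total *= len(variants)
    return total


def select_variants(
    blocks: Sequence[Sequence[Variant]],
    latency_budget_ms: float,
    memory_budget_mb: float,
) -> Optional[Tuple[Variant, ...]]:
    """Pick one variant per block with the smallest total accuracy drop
    that fits both budgets; return None if no combination fits."""
    best = None
    best_drop = float("inf")
    for combo in product(*blocks):  # exhaustive search, fine for a toy example
        latency = sum(v.latency_ms for v in combo)
        memory = sum(v.memory_mb for v in combo)
        if latency > latency_budget_ms or memory > memory_budget_mb:
            continue
        drop = sum(v.accuracy_drop for v in combo)
        if drop < best_drop:
            best, best_drop = combo, drop
    return best


# Three blocks, each with a full-size and a smaller descendant (illustrative values).
blocks = [
    [Variant("b0-full", 10, 20, 0.0), Variant("b0-small", 6, 12, 1.5)],
    [Variant("b1-full", 12, 30, 0.0), Variant("b1-small", 7, 15, 2.0)],
    [Variant("b2-full", 8, 10, 0.0), Variant("b2-small", 5, 6, 0.4)],
]

chosen = select_variants(blocks, latency_budget_ms=25, memory_budget_mb=50)
print(count_combinations(blocks), [v.name for v in chosen])
```

Even in this toy setting the number of configurations is the product of the per-block choices, which is why training a few blocks can yield many model sizes. An exhaustive search like this one stops being practical as blocks and variants grow, so a real system would need a more efficient optimizer.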