Dispense Mode for Inference to Accelerate Branchynet

Zhiwei Liang, Yuezhi Zhou
2022 IEEE International Conference on Image Processing (ICIP), published 2022-10-16
DOI: 10.1109/ICIP46576.2022.9897574 (https://doi.org/10.1109/ICIP46576.2022.9897574)
Citations: 2

Abstract

As deep neural networks (DNNs) have grown deeper and wider, they have achieved the best results in computer vision, but their heavy computation places a significant burden on IoT devices. To speed up DNN inference, BranchyNet introduced early exit, in which samples leave the network at shallow layers to reduce the amount of computation. However, BranchyNet still performs some unnecessary intermediate calculations during inference. We propose a dispense mode to solve this problem, which optimizes the accuracy and latency of BranchyNet at the same time. The dispense mode determines the exit position of each sample in the multi-branch network directly from the sample's difficulty, without intermediate trial and error. Under the same accuracy requirements, inference speed improves by 30%-50% compared with the cascade mode of BranchyNet. Moreover, while further reducing redundant computation, the dispense mode provides a method for dynamically adjusting accuracy. Our framework can therefore easily adjust the model's accuracy to meet higher throughput demands.
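The difference between the two inference modes can be illustrated with a toy cost model. This is a minimal sketch, not the authors' implementation: the abstract does not say how sample difficulty is estimated, so the scalar `difficulty` score, the `thresholds` list, the `dispenser_cost` term, and all cost values below are hypothetical. In cascade mode a sample that exits at branch k has paid for every earlier branch as well; in dispense mode it pays for the difficulty estimate and for branch k only.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Costs:
    trunk: List[float]   # cost of each backbone segment, shallow to deep
    branch: List[float]  # cost of the exit head attached after each segment


def cascade_cost(costs: Costs, exit_index: int) -> float:
    """Cascade mode: every exit head up to the final one is evaluated."""
    trunk = sum(costs.trunk[: exit_index + 1])
    heads = sum(costs.branch[: exit_index + 1])
    return trunk + heads


def dispense_cost(costs: Costs, exit_index: int, dispenser_cost: float) -> float:
    """Dispense mode: the exit is chosen up front, so only one head runs.

    `dispenser_cost` is the assumed overhead of estimating difficulty.
    """
    trunk = sum(costs.trunk[: exit_index + 1])
    return dispenser_cost + trunk + costs.branch[exit_index]


def choose_exit(difficulty: float, thresholds: List[float]) -> int:
    """Map a hypothetical difficulty score to an exit index.

    `thresholds` is ascending and has one entry fewer than the number of
    exits; samples harder than every threshold go to the final exit.
    """
    for index, limit in enumerate(thresholds):
        if difficulty <= limit:
            return index
    return len(thresholds)


if __name__ == "__main__":
    costs = Costs(trunk=[1.0, 2.0, 3.0], branch=[0.5, 0.5, 0.5])
    thresholds = [0.3, 0.7]
    for difficulty in (0.1, 0.5, 0.9):
        k = choose_exit(difficulty, thresholds)
        print(difficulty, k, cascade_cost(costs, k), dispense_cost(costs, k, 0.25))
```

In this toy model the saving grows with exit depth, and a sample sent to the first exit is slightly more expensive in dispense mode because of the dispenser overhead. Raising the thresholds sends more samples to shallow exits, which is one way to picture the dynamic trade of accuracy for throughput that the abstract describes.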