Dispense Mode for Inference to Accelerate Branchynet

2022 IEEE International Conference on Image Processing (ICIP) Pub Date : 2022-10-16 DOI:10.1109/ICIP46576.2022.9897574

Zhiwei Liang, Yuezhi Zhou

Citations: 2

Abstract

As Deep Neural Networks (DNNs) have grown deeper and wider, they have achieved the best results in computer vision, but their massive computation places a heavy burden on IoT devices. To speed up the inference of DNN models, BranchyNet introduced early exit, in which samples exit from shallow layers to reduce the amount of computation the model performs. However, BranchyNet still carries out some unnecessary intermediate calculations during inference. We propose a dispense mode to solve this problem, which optimizes the accuracy and latency of BranchyNet at the same time. The dispense mode determines the exit position of each sample in the multi-branch network directly from the difficulty of the sample, without intermediate trial and error. Under the same accuracy requirements, inference speed improves by 30%-50% compared with the cascade mode of BranchyNet. Moreover, in addition to reducing redundant computation, the dispense mode provides a method for adjusting accuracy dynamically. Thus, our framework can easily adjust the accuracy of the model to meet higher throughput requirements.
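The difference between the two modes can be illustrated with a small cost model. This is only a sketch under stated assumptions, not the authors' implementation: the abstract does not say how sample difficulty is estimated, so the difficulty score, the `thresholds`, the `router_cost`, and the per-stage costs below are all hypothetical values chosen for illustration.

```python
from typing import Sequence

# Hypothetical per-stage costs in arbitrary units (not taken from the paper):
# SEGMENT_COST[i] is backbone segment i, HEAD_COST[i] is the exit head after it.
SEGMENT_COST = [4.0, 6.0, 10.0]
HEAD_COST = [1.0, 1.0, 1.0]


def cascade_cost(confident_at: int) -> float:
    """Cost when the sample is tried at every exit up to `confident_at` (0-based)."""
    # Cascade mode evaluates each backbone segment and each exit head in turn
    # until one head is confident enough to return a prediction.
    return sum(SEGMENT_COST[: confident_at + 1]) + sum(HEAD_COST[: confident_at + 1])


def dispense_cost(assigned_exit: int, router_cost: float = 0.5) -> float:
    """Cost when the exit is chosen up front and earlier exit heads are skipped."""
    # Assumption: choosing the exit costs `router_cost`, and only the head at
    # the assigned exit is evaluated.
    return router_cost + sum(SEGMENT_COST[: assigned_exit + 1]) + HEAD_COST[assigned_exit]


def assign_exit(difficulty: float, thresholds: Sequence[float]) -> int:
    """Map a difficulty score in [0, 1] to an exit index using ascending thresholds."""
    for exit_index, limit in enumerate(thresholds):
        if difficulty <= limit:
            return exit_index
    return len(thresholds)  # the hardest samples go to the final exit


if __name__ == "__main__":
    thresholds = [0.3, 0.7]  # hypothetical cut points for a three-exit network
    for difficulty in (0.1, 0.5, 0.9):
        exit_index = assign_exit(difficulty, thresholds)
        print(difficulty, exit_index, cascade_cost(exit_index), dispense_cost(exit_index))
```

In this toy model the saving comes from skipping the exit heads a hard sample would otherwise be tried at, while the easiest samples pay a small routing overhead. Raising the thresholds sends more samples to shallow exits, which is one plausible way to read the abstract's remark about adjusting accuracy to meet higher throughput.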



