Unsupervised Early Exit in DNNs with Multiple Exits

U. HariNarayanN, M. Hanawal, Avinash Bhardwaj
{"title":"Unsupervised Early Exit in DNNs with Multiple Exits","authors":"U. HariNarayanN, M. Hanawal, Avinash Bhardwaj","doi":"10.1145/3564121.3564137","DOIUrl":null,"url":null,"abstract":"Deep Neural Networks (DNNs) are generally designed as sequentially cascaded differentiable blocks/layers with a prediction module connected only to its last layer. DNNs can be attached with prediction modules at multiple points along the backbone where inference can stop at an intermediary stage without passing through all the modules. The last exit point may offer a better prediction error but also involves more computational resources and latency. An exit point that is ‘optimal’ in terms of both prediction error and cost is desirable. The optimal exit point may depend on the latent distribution of the tasks and may change from one task type to another. During neural inference, the ground truth of instances may not be available and the error rates at each exit point cannot be estimated. Hence one is faced with the problem of selecting the optimal exit in an unsupervised setting. Prior works tackled this problem in an offline supervised setting assuming that enough labeled data is available to estimate the error rate at each exit point and tune the parameters for better accuracy. However, pre-trained DNNs are often deployed in new domains for which a large amount of ground truth may not be available. We thus model the problem of exit selection as an unsupervised online learning problem and leverage the bandit theory to identify the optimal exit point. Specifically, we focus on the Elastic BERT, a pre-trained multi-exit DNN to demonstrate that it ‘nearly’ satisfies the Strong Dominance (SD) property making it possible to learn the optimal exit in an online setup without knowing the ground truth labels. We develop upper confidence bound (UCB) based algorithm named UEE-UCB that provably achieves sub-linear regret under the SD property. Thus our method provides a means to adaptively learn domain-specific optimal exit points in multi-exit DNNs. We empirically validate our algorithm on IMDb and Yelp datasets.","PeriodicalId":166150,"journal":{"name":"Proceedings of the Second International Conference on AI-ML Systems","volume":"43 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Second International Conference on AI-ML Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3564121.3564137","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

Deep Neural Networks (DNNs) are generally designed as sequentially cascaded differentiable blocks/layers with a prediction module connected only to the last layer. Prediction modules can instead be attached at multiple points along the backbone, so that inference can stop at an intermediate stage without passing through all the modules. The last exit point may offer a lower prediction error, but it also demands more computational resources and incurs higher latency. An exit point that is ‘optimal’ in terms of both prediction error and cost is therefore desirable. The optimal exit point may depend on the latent distribution of the tasks and may change from one task type to another. During inference, the ground truth of instances may not be available, so the error rates at the exit points cannot be estimated directly. One is hence faced with the problem of selecting the optimal exit in an unsupervised setting. Prior works tackled this problem in an offline supervised setting, assuming that enough labeled data is available to estimate the error rate at each exit point and tune the parameters for better accuracy. However, pre-trained DNNs are often deployed in new domains for which a large amount of ground truth may not be available. We thus model exit selection as an unsupervised online learning problem and leverage bandit theory to identify the optimal exit point. Specifically, we focus on ElasticBERT, a pre-trained multi-exit DNN, and demonstrate that it ‘nearly’ satisfies the Strong Dominance (SD) property, making it possible to learn the optimal exit in an online setup without knowing the ground-truth labels. We develop an upper confidence bound (UCB) based algorithm, named UEE-UCB, that provably achieves sub-linear regret under the SD property. Our method thus provides a means to adaptively learn domain-specific optimal exit points in multi-exit DNNs. We empirically validate our algorithm on the IMDb and Yelp datasets.
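To make the setup concrete, the following is a minimal sketch of a backbone with a prediction module attached after every block, so that a forward pass can stop at any chosen exit. The PyTorch class, layer shapes, and module names are illustrative assumptions, not the ElasticBERT architecture evaluated in the paper.

```python
# Minimal multi-exit network sketch: a cascade of differentiable blocks
# with a lightweight prediction module ("exit") after each block.
# Illustrative assumption only -- not the paper's ElasticBERT model.
import torch
import torch.nn as nn

class MultiExitNet(nn.Module):
    def __init__(self, hidden_dim: int, num_classes: int, num_blocks: int):
        super().__init__()
        # Sequentially cascaded backbone blocks.
        self.blocks = nn.ModuleList(
            [nn.Sequential(nn.Linear(hidden_dim, hidden_dim), nn.ReLU())
             for _ in range(num_blocks)]
        )
        # One prediction module attached after each block.
        self.exits = nn.ModuleList(
            [nn.Linear(hidden_dim, num_classes) for _ in range(num_blocks)]
        )

    def forward(self, x: torch.Tensor, exit_at: int) -> torch.Tensor:
        # Run only the blocks up to the chosen exit; the remaining blocks
        # are never evaluated, which is the source of the compute and
        # latency savings.
        for block in self.blocks[: exit_at + 1]:
            x = block(x)
        return self.exits[exit_at](x)
```

Exiting at a later index typically lowers the prediction error but evaluates more blocks, which is exactly the error-versus-cost trade-off described above.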
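Since labels are unavailable at inference time, the exit-selection loop can be cast as an online bandit problem with one arm per exit. Below is a hedged sketch of a UCB-style selection loop; the scalar signal `observe_cost` (an unsupervised loss proxy plus a latency term) is an assumption made for illustration, whereas UEE-UCB's actual estimator is the one the paper derives from the Strong Dominance property.

```python
# Hedged sketch of UCB-style exit selection (in the spirit of UEE-UCB,
# but with an assumed cost signal rather than the paper's SD-based
# estimator). Lower cost is better, so the exploration bonus is
# subtracted from the running mean (a lower-confidence-bound index).
import math
import random

def select_exit_ucb(num_exits, rounds, observe_cost):
    counts = [0] * num_exits       # pulls per exit (arm)
    mean_cost = [0.0] * num_exits  # running mean of observed cost per arm

    for t in range(1, rounds + 1):
        if t <= num_exits:
            arm = t - 1            # play each arm once to initialize
        else:
            arm = min(
                range(num_exits),
                key=lambda k: mean_cost[k]
                - math.sqrt(2.0 * math.log(t) / counts[k]),
            )
        cost = observe_cost(arm)   # unsupervised proxy loss + latency cost
        counts[arm] += 1
        mean_cost[arm] += (cost - mean_cost[arm]) / counts[arm]

    return min(range(num_exits), key=lambda k: mean_cost[k])

# Toy usage with assumed per-exit error probabilities and a small
# per-block latency penalty; the learned exit balances the two.
error_prob = [0.40, 0.25, 0.15, 0.14]
best = select_exit_ucb(
    num_exits=4,
    rounds=5000,
    observe_cost=lambda k: float(random.random() < error_prob[k]) + 0.05 * (k + 1),
)
print("learned exit:", best)
```

The sub-linear regret guarantee in the paper says that, over many rounds, a loop of this kind converges to pulling the cost-optimal exit almost all of the time.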