EdgeBoost: Confidence boosting for resource constrained inference via selective offloading

IF 4.4 Zone 2 Computer Science Q1 COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Computer Networks Pub Date : 2025-06-16 DOI:10.1016/j.comnet.2025.111437

Naina Said, Olaf Landsiedel

Computer Networks, Volume 269, Article 111437. https://www.sciencedirect.com/science/article/pii/S1389128625004049

Citations: 0

Abstract

Deploying large Deep Neural Networks with state-of-the-art accuracy on edge devices is often impractical due to their limited resources. This paper introduces EdgeBoost, a selective input offloading system designed to overcome the challenges of limited computational resources on edge devices. EdgeBoost trains and calibrates a lightweight model for deployment on the edge and, in addition, deploys a large, complex model in the cloud. During inference, the edge model makes an initial prediction for each input sample; if the confidence of that prediction is low, the sample is sent to the cloud model for further processing; otherwise, we accept the local prediction. Through careful calibration, EdgeBoost reduces the communication cost by 55%, 27% and 20% for the CIFAR-100, ImageNet-1k and Stanford Cars datasets, respectively, compared to a cloud-only solution, while achieving on-par classification accuracy. Furthermore, EdgeBoost reduces the total inference latency from 148 ms to 123.84 ms per inference compared to a cloud-only solution. Our evaluation also shows that calibrating the edge model for such a collaborative edge–cloud setup results in accuracy gains of up to 8 percentage points compared to an uncalibrated edge model. Additionally, EdgeBoost, when used as an abstaining classifier, can improve accuracy by up to 9 percentage points over an uncalibrated model. Finally, EdgeBoost outperforms the Early Exit and Entropy thresholding baselines and achieves accuracy comparable to state-of-the-art routing-based methods without the need for hosting the router on the edge.
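
The offloading decision described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how confidence is computed or which calibration method is used, so this sketch assumes the confidence is the maximum softmax probability and uses temperature scaling as a stand-in for calibration. The names `softmax`, `predict_with_offload`, `cloud_predict`, `threshold` and `temperature` are all hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax; temperature=1.0 gives the plain softmax."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def predict_with_offload(
    edge_logits: Sequence[float],
    cloud_predict: Callable[[], int],
    threshold: float = 0.8,    # assumed value, not taken from the paper
    temperature: float = 1.0,  # assumed calibration parameter
) -> Tuple[int, str, float]:
    """Return (label, route, edge confidence) for one input sample.

    cloud_predict is a hypothetical stand-in for sending the sample to the
    cloud model; it is called only when the edge confidence is too low.
    """
    probs = softmax(edge_logits, temperature)
    confidence = max(probs)
    if confidence >= threshold:
        # Confident enough: accept the local (edge) prediction.
        return probs.index(confidence), "edge", confidence
    # Low confidence: offload the sample to the cloud model.
    return cloud_predict(), "cloud", confidence


# One peaked and one flat logit vector; the lambda fakes the cloud model.
print(predict_with_offload([4.0, 0.5, 0.1], lambda: 2))
print(predict_with_offload([1.0, 0.9, 0.8], lambda: 2))
```

In this sketch a larger `temperature` flattens the edge probabilities, so more samples fall below `threshold` and are offloaded. Under these assumptions, calibration shifts the trade-off between communication cost and accuracy.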

Source journal

Computer Networks (Engineering & Technology: Telecommunications)

CiteScore

10.80

Self-citation rate

3.60%

Articles published

434

Review time

8.6 months

Journal description: Computer Networks is an international, archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in the computer communications networking area. The audience includes researchers, managers and operators of networks as well as designers and implementors. The Editorial Board will consider any material for publication that is of interest to those groups.