EdgeBoost: Confidence boosting for resource constrained inference via selective offloading
Authors: Naina Said, Olaf Landsiedel
Journal: Computer Networks, Volume 269, Article 111437
Publication date: 2025-06-16
DOI: 10.1016/j.comnet.2025.111437
URL: https://www.sciencedirect.com/science/article/pii/S1389128625004049
Deploying large Deep Neural Networks with state-of-the-art accuracy on edge devices is often impractical because of the devices' limited resources. This paper introduces EdgeBoost, a selective input offloading system designed to overcome the limited computational resources of edge devices. EdgeBoost trains and calibrates a lightweight model for deployment on the edge and, in addition, deploys a large, complex model in the cloud. During inference, the edge model makes an initial prediction for each input sample. If the confidence of that prediction is low, the sample is sent to the cloud model for further processing; otherwise, the local prediction is accepted. Through careful calibration, EdgeBoost reduces the communication cost by 55%, 27% and 20% for the CIFAR-100, ImageNet-1k and Stanford Cars datasets, respectively, compared to a cloud-only solution, while achieving on-par classification accuracy. Furthermore, EdgeBoost reduces the total inference latency from 148 ms to 123.84 ms per inference compared to a cloud-only solution. Our evaluation also shows that calibrating the edge model for such a collaborative edge–cloud setup results in accuracy gains of up to 8 percentage points compared to an uncalibrated edge model. Additionally, EdgeBoost, when used as an abstaining classifier, can improve accuracy by up to 9 percentage points over an uncalibrated model. Finally, EdgeBoost outperforms the Early Exit and Entropy thresholding baselines and achieves accuracy comparable to state-of-the-art routing-based methods without the need to host a router on the edge.
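The inference-time decision described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the calibration method or the confidence threshold, so temperature scaling is used here as a stand-in, and the names `softmax`, `predict_or_offload`, `cloud_predict`, as well as the values 0.8 and 1.5, are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax; a temperature above 1 softens the output."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def predict_or_offload(
    edge_logits: Sequence[float],
    cloud_predict: Callable[[], int],
    threshold: float = 0.8,    # hypothetical confidence threshold
    temperature: float = 1.5,  # hypothetical calibration temperature
) -> Tuple[int, bool]:
    """Return (label, offloaded).

    The edge prediction is accepted when its calibrated confidence reaches
    the threshold; otherwise the sample is handed to the cloud model.
    """
    probs = softmax(edge_logits, temperature)
    confidence = max(probs)
    if confidence >= threshold:
        return probs.index(confidence), False  # keep the local prediction
    return cloud_predict(), True  # low confidence: offload to the cloud


if __name__ == "__main__":
    # Stand-in for a call to the large cloud model (assumed to return a label).
    fake_cloud = lambda: 2
    print(predict_or_offload([4.0, 0.5, 0.2], fake_cloud))  # confident edge output
    print(predict_or_offload([1.0, 0.9, 0.8], fake_cloud))  # ambiguous edge output
```

In this sketch the threshold controls the trade-off the abstract reports: raising it sends more samples to the cloud (higher communication cost, accuracy closer to the cloud model), while lowering it keeps more predictions local.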
About the journal:
Computer Networks is an international, archival journal providing a publication vehicle for complete coverage of all topics of interest to those involved in the computer communications networking area. The audience includes researchers, managers and operators of networks as well as designers and implementors. The Editorial Board will consider any material for publication that is of interest to those groups.