Energy-Efficient and Accuracy-Aware DNN Inference With IoT Device-Edge Collaboration

Impact Factor 5.5 · CAS Rank 2 (Computer Science) · JCR Q1 (COMPUTER SCIENCE, INFORMATION SYSTEMS)
Wei Jiang, Haichao Han, Daquan Feng, Liping Qian, Qian Wang, Xiang-Gen Xia
DOI: 10.1109/TSC.2025.3536311
Journal: IEEE Transactions on Services Computing, vol. 18, no. 2, pp. 784-797
Published: 2025-01-30 (Journal Article)
URL: https://ieeexplore.ieee.org/document/10858448/
Citations: 0

Abstract

Due to the limited energy and computing resources of Internet of Things (IoT) devices, collaboration between IoT devices and edge servers is considered for handling complex deep neural network (DNN) inference tasks. However, the heterogeneity of IoT devices and the varying accuracy requirements of inference tasks make it difficult to deploy all DNN models on edge servers. Moreover, collaborative inference involves large-scale data transmission, increasing the demand on spectrum resources and energy consumption. To address these issues, this paper first designs an accuracy-aware multi-branch DNN inference model and quantifies the relationship between branch selection and inference accuracy. Then, based on the multi-branch DNN model, we aim to minimize the energy consumption of devices by jointly optimizing the selection of DNN branches and partition layers, as well as the allocation of computing and communication resources. The resulting problem is a mixed-integer nonlinear programming problem. We propose a hierarchical approach to decompose the problem, and then solve it with a proportional-integral-derivative (PID) based search algorithm. Experimental results demonstrate that the proposed scheme achieves better inference performance and reduces total energy consumption by up to 65.3%, compared to other collaboration schemes.
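To make the optimization described above concrete, the following is a minimal sketch of joint branch selection and partition-point choice for a multi-branch (early-exit) DNN. All numbers (layer energies, tensor sizes, branch accuracies) are illustrative assumptions, not the paper's measurements, and the paper's hierarchical decomposition with PID-based search is replaced here by exhaustive enumeration over the small discrete space.

```python
# Hypothetical 4-layer backbone with early-exit branches after layers 2 and 4.
DEVICE_ENERGY = [0.5, 0.8, 1.2, 1.5]    # J consumed running layer k on the device
OUTPUT_BITS = [4e6, 2e6, 1e6, 5e5]      # bits output by layer k (shrinks with depth)
INPUT_BITS = 8e6                        # bits of the raw input sample
BRANCH_ACC = {2: 0.88, 4: 0.95}         # inference accuracy of each exit branch
TX_ENERGY_PER_BIT = 1e-7                # J/bit for the uplink (assumed constant)

def device_energy(branch: int, split: int) -> float:
    """Device-side energy when layers [0, split) run locally and the
    remainder of the chosen branch runs on the edge server."""
    compute = sum(DEVICE_ENERGY[:split])
    if split == branch:                  # full branch runs locally: nothing sent
        transmit = 0.0
    elif split == 0:                     # full offload: raw input is transmitted
        transmit = INPUT_BITS * TX_ENERGY_PER_BIT
    else:                                # partial offload: intermediate tensor sent
        transmit = OUTPUT_BITS[split - 1] * TX_ENERGY_PER_BIT
    return compute + transmit

def best_plan(acc_target: float):
    """Enumerate (branch, split) pairs meeting the accuracy target and
    return (energy, branch, split) with minimum device energy, or None."""
    best = None
    for branch, acc in BRANCH_ACC.items():
        if acc < acc_target:
            continue
        for split in range(branch + 1):
            e = device_energy(branch, split)
            if best is None or e < best[0]:
                best = (e, branch, split)
    return best

print(best_plan(0.9))   # a tight accuracy target forces the deeper branch
```

With these toy numbers, a 90% accuracy target rules out the shallow branch, and full offload of the deep branch minimizes device energy; relaxing the target lets the shallow exit satisfy the constraint instead. This captures the coupling between branch choice, partition layer, and transmission cost that the paper's MINLP formulation optimizes jointly with resource allocation.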
Source Journal
IEEE Transactions on Services Computing (COMPUTER SCIENCE, INFORMATION SYSTEMS; COMPUTER SCIENCE, SOFTWARE ENGINEERING)
CiteScore: 11.50
Self-citation rate: 6.20%
Annual article volume: 278
Review time: >12 weeks
Journal description: IEEE Transactions on Services Computing encompasses the computing and software aspects of the science and technology of services innovation research and development. It emphasizes algorithmic, mathematical, statistical, and computational methods central to services computing. Topics covered include Service Oriented Architecture, Web Services, Business Process Integration, Solution Performance Management, and Services Operations and Management. The transactions address mathematical foundations, security, privacy, agreement, contract, discovery, negotiation, collaboration, and quality of service for web services, and also cover composite web service creation, business and scientific applications, standards, utility models, business process modeling, integration, and collaboration in the realm of Services Computing.