{"title":"通过物联网设备边缘协作实现高能效、高精确度的 DNN 推断","authors":"Wei Jiang;Haichao Han;Daquan Feng;Liping Qian;Qian Wang;Xiang-Gen Xia","doi":"10.1109/TSC.2025.3536311","DOIUrl":null,"url":null,"abstract":"Due to the limited energy and computing resources of Internet of Things (IoT) devices, the collaboration of IoT devices and edge servers is considered to handle the complex deep neural network (DNN) inference tasks. However, the heterogeneity of IoT devices and the various accuracy requirements of inference tasks make it difficult to deploy all the DNN models in edge servers. Moreover, a large-scale data transmission is engaged in collaborative inference, resulting in an increased demand on spectrum resource and energy consumption. To address these issues, in this paper, we first design an accuracy-aware multi-branch DNN inference model and quantify the relationship between branch selection and inference accuracy. Then, based on the multi-branch DNN model, we aim to minimize the energy consumption of devices by jointly optimizing the selection of DNN branches and partition layers, as well as the computing and communication resources allocation. The proposed problem is a mixed-integer nonlinear programming problem. We propose a hierarchical approach to decompose the problem, and then solve it with a proportional integral derivative based searching algorithm. Experimental results demonstrate our proposed scheme has better inference performance and can reduce the total energy consumption up to 65.3<inline-formula><tex-math>$\\%$</tex-math></inline-formula>, compared to other collaboration schemes.","PeriodicalId":13255,"journal":{"name":"IEEE Transactions on Services Computing","volume":"18 2","pages":"784-797"},"PeriodicalIF":5.5000,"publicationDate":"2025-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Energy-Efficient and Accuracy-Aware DNN Inference With IoT Device-Edge Collaboration\",\"authors\":\"Wei Jiang;Haichao Han;Daquan Feng;Liping Qian;Qian Wang;Xiang-Gen Xia\",\"doi\":\"10.1109/TSC.2025.3536311\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Due to the limited energy and computing resources of Internet of Things (IoT) devices, the collaboration of IoT devices and edge servers is considered to handle the complex deep neural network (DNN) inference tasks. However, the heterogeneity of IoT devices and the various accuracy requirements of inference tasks make it difficult to deploy all the DNN models in edge servers. Moreover, a large-scale data transmission is engaged in collaborative inference, resulting in an increased demand on spectrum resource and energy consumption. To address these issues, in this paper, we first design an accuracy-aware multi-branch DNN inference model and quantify the relationship between branch selection and inference accuracy. Then, based on the multi-branch DNN model, we aim to minimize the energy consumption of devices by jointly optimizing the selection of DNN branches and partition layers, as well as the computing and communication resources allocation. The proposed problem is a mixed-integer nonlinear programming problem. We propose a hierarchical approach to decompose the problem, and then solve it with a proportional integral derivative based searching algorithm. 
Experimental results demonstrate our proposed scheme has better inference performance and can reduce the total energy consumption up to 65.3<inline-formula><tex-math>$\\\\%$</tex-math></inline-formula>, compared to other collaboration schemes.\",\"PeriodicalId\":13255,\"journal\":{\"name\":\"IEEE Transactions on Services Computing\",\"volume\":\"18 2\",\"pages\":\"784-797\"},\"PeriodicalIF\":5.5000,\"publicationDate\":\"2025-01-30\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"IEEE Transactions on Services Computing\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://ieeexplore.ieee.org/document/10858448/\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Services Computing","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/10858448/","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Abstract: Because Internet of Things (IoT) devices have limited energy and computing resources, collaboration between IoT devices and edge servers is considered for handling complex deep neural network (DNN) inference tasks. However, the heterogeneity of IoT devices and the varying accuracy requirements of inference tasks make it difficult to deploy all DNN models on edge servers. Moreover, collaborative inference involves large-scale data transmission, which increases both the demand for spectrum resources and the energy consumption. To address these issues, we first design an accuracy-aware multi-branch DNN inference model and quantify the relationship between branch selection and inference accuracy. Then, based on the multi-branch DNN model, we minimize the energy consumption of devices by jointly optimizing the selection of DNN branches and partition layers as well as the allocation of computing and communication resources. The resulting problem is a mixed-integer nonlinear programming problem. We propose a hierarchical approach to decompose it, and then solve it with a proportional-integral-derivative (PID)-based search algorithm. Experimental results demonstrate that our proposed scheme achieves better inference performance and reduces the total energy consumption by up to 65.3% compared with other collaboration schemes.
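
To make the decision space described in the abstract concrete, the following is a minimal Python sketch of accuracy-aware device-edge collaborative inference with a multi-branch (early-exit) DNN: an exit branch that satisfies the task's accuracy requirement and a partition layer are chosen so that device-side energy (local computation plus transmission of the intermediate feature map) is minimized. The energy model, all numerical parameters (kappa, f_dev, tx_power, rate), and the brute-force search are illustrative assumptions rather than the paper's formulation; the paper instead decomposes the joint problem hierarchically, also optimizes computing and communication resource allocation, and solves it with a PID-based search.

# A minimal sketch of accuracy-aware, device-edge collaborative inference with a
# multi-branch (early-exit) DNN. All layer profiles and energy parameters below are
# hypothetical placeholders, not values from the paper.

from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Branch:
    accuracy: float               # expected accuracy when exiting at this branch
    input_bits: float             # size of the raw input (bits)
    layer_flops: List[float]      # compute load of each layer up to this exit (FLOPs)
    layer_out_bits: List[float]   # output feature-map size of each layer (bits)

def device_energy(br: Branch, split: int,
                  f_dev: float = 1e9,           # device CPU frequency (Hz), assumed
                  kappa: float = 1e-27,         # effective switched capacitance, assumed
                  cycles_per_flop: float = 4.0,
                  tx_power: float = 0.1,        # uplink transmit power (W), assumed
                  rate: float = 5e6) -> float:  # uplink rate (bit/s), assumed
    """Device-side energy: local computation of layers 1..split plus transmission
    of the intermediate feature map (or of the raw input when split == 0)."""
    comp = kappa * (f_dev ** 2) * cycles_per_flop * sum(br.layer_flops[:split])
    tx_bits = br.input_bits if split == 0 else br.layer_out_bits[split - 1]
    return comp + tx_power * tx_bits / rate

def select_branch_and_split(branches: List[Branch],
                            acc_req: float) -> Tuple[Optional[int], Optional[int], float]:
    """Brute-force search over (exit branch, partition layer) pairs that satisfy the
    accuracy requirement, returning the pair with the lowest device energy. The paper
    instead decomposes the joint problem hierarchically and uses a PID-based search;
    this loop only illustrates the decision space."""
    best = (None, None, float("inf"))
    for b_idx, br in enumerate(branches):
        if br.accuracy < acc_req:
            continue   # this exit branch cannot meet the task's accuracy requirement
        for split in range(len(br.layer_flops) + 1):
            e = device_energy(br, split)
            if e < best[2]:
                best = (b_idx, split, e)
    return best

if __name__ == "__main__":
    # Two toy exit branches: a shallow, less accurate one and a deeper, more accurate one.
    shallow = Branch(accuracy=0.86, input_bits=6e6,
                     layer_flops=[2e8, 3e8, 1e8],
                     layer_out_bits=[4e6, 2e6, 1e5])
    deep = Branch(accuracy=0.93, input_bits=6e6,
                  layer_flops=[2e8, 3e8, 4e8, 4e8, 1e8],
                  layer_out_bits=[4e6, 2e6, 2e6, 1e6, 1e5])
    b, l, e = select_branch_and_split([shallow, deep], acc_req=0.90)
    print(f"branch={b}, partition after layer {l}, device energy ~= {e:.3f} J")

In this toy run the accuracy constraint rules out the shallow branch, and fully offloading the deep branch (split = 0) minimizes device energy under the assumed parameters; with a faster device or a slower uplink the optimum shifts toward later partition points, which is the trade-off the paper's joint optimization targets.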
Journal Introduction:
IEEE Transactions on Services Computing encompasses the computing and software aspects of the science and technology of services innovation research and development. It places emphasis on algorithmic, mathematical, statistical, and computational methods central to services computing. Topics covered include Service Oriented Architecture, Web Services, Business Process Integration, Solution Performance Management, and Services Operations and Management. The transactions address mathematical foundations, security, privacy, agreement, contract, discovery, negotiation, collaboration, and quality of service for web services. It also covers areas like composite web service creation, business and scientific applications, standards, utility models, business process modeling, integration, collaboration, and more in the realm of Services Computing.