{"title":"Online Model Retraining, Compression, and Instance Allocation in Edge Computing Networks","authors":"Shijia Huang;Fan Yang;Qian Ma;Shimin Gong","doi":"10.1109/TNSE.2026.3665761","DOIUrl":null,"url":null,"abstract":"The adoption of artificial intelligence models (e.g., DNN models) in the Internet of Things has boosted computing demands in edge computing. Frequent model retraining, necessitated by concept drift, further increases resource usage, while model compression sacrifices model performance for computing efficiency. However, few works study the computing instance allocation problem while considering dynamic model retraining and compression, especially under varying workloads and model performance degradation. In this work, we formulate a joint online model retraining, compression, and instance allocation problem in edge computing networks that accounts for both model performance and instance cost. Solving the online problem is challenging since it is a non-linear binary programming problem with time-coupled instance switching costs. We first solve the online problem under fixed compression and propose an efficient online algorithm. Specifically, we linearize the non-linear term, then regularize the time-coupled switching cost to decouple the problem, and finally apply a randomized rounding method to obtain an integral solution. We prove that our algorithm achieves a constant optimality gap. We then solve the online problem under flexible compression and propose a lightweight online algorithm: we extend the linearization method, decouple the problem across time slots, and demonstrate that our algorithm achieves an optimality gap that depends on the time period. Simulations demonstrate that our algorithm can balance instance cost and model performance in both the fixed and flexible compression scenarios.","PeriodicalId":54229,"journal":{"name":"IEEE Transactions on Network Science and Engineering","volume":"13 ","pages":"7156-7172"},"PeriodicalIF":7.9000,"publicationDate":"2026-02-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Transactions on Network Science and Engineering","FirstCategoryId":"94","ListUrlMain":"https://ieeexplore.ieee.org/document/11397553/","RegionNum":2,"RegionCategory":"Computer Science","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"ENGINEERING, MULTIDISCIPLINARY","Score":null,"Total":0}
Citations: 0
Abstract
The adoption of artificial intelligence models (e.g., DNN models) in the Internet of Things has boosted computing demands in edge computing. Frequent model retraining, necessitated by concept drift, further increases resource usage, while model compression sacrifices model performance for computing efficiency. However, few works study the computing instance allocation problem while considering dynamic model retraining and compression, especially under varying workloads and model performance degradation. In this work, we formulate a joint online model retraining, compression, and instance allocation problem in edge computing networks that accounts for both model performance and instance cost. Solving the online problem is challenging since it is a non-linear binary programming problem with time-coupled instance switching costs. We first solve the online problem under fixed compression and propose an efficient online algorithm. Specifically, we linearize the non-linear term, then regularize the time-coupled switching cost to decouple the problem, and finally apply a randomized rounding method to obtain an integral solution. We prove that our algorithm achieves a constant optimality gap. We then solve the online problem under flexible compression and propose a lightweight online algorithm: we extend the linearization method, decouple the problem across time slots, and demonstrate that our algorithm achieves an optimality gap that depends on the time period. Simulations demonstrate that our algorithm can balance instance cost and model performance in both the fixed and flexible compression scenarios.
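The randomized rounding step mentioned in the abstract can be illustrated with a minimal sketch (an assumption for illustration, not the paper's actual algorithm): given a fractional instance count produced by a relaxed, regularized problem, round it to a neighboring integer so that the expected value equals the fractional solution. The function name `randomized_round` and the sample value 3.4 are illustrative choices, not taken from the paper.

```python
import random

def randomized_round(x, rng=random.Random(0)):
    """Round a fractional instance count x to floor(x) or floor(x)+1,
    choosing the larger value with probability equal to the fractional
    part, so the expected value of the result equals x."""
    base = int(x)          # floor for non-negative x
    frac = x - base
    return base + (1 if rng.random() < frac else 0)

# Empirical check: every rounded count is an integral neighbor of x,
# and the sample average approaches x as the sample size grows.
samples = [randomized_round(3.4) for _ in range(100_000)]
avg = sum(samples) / len(samples)
```

Preserving the expectation in this way is what lets such schemes relate the cost of the integral solution back to the cost of the fractional relaxation in expectation.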
Journal overview:
The IEEE Transactions on Network Science and Engineering (TNSE) is committed to the timely publication of peer-reviewed technical articles that deal with the theory and applications of network science and the interconnections among the elements in a system that form a network. In particular, TNSE publishes articles on the understanding, prediction, and control of the structures and behaviors of networks at the fundamental level. The types of networks covered include physical or engineered networks, information networks, biological networks, semantic networks, economic networks, social networks, and ecological networks. The journal aims to discover common principles that govern network structures, functionalities, and behaviors. Another trans-disciplinary focus of TNSE is the interactions between, and co-evolution of, different genres of networks.