A Lightweight Cloud-Edge Collaborative Intelligence Inference Framework With Runtime Dynamic Optimization for Resource-Constrained Consumer Electronics
IF 10.9 · Region 2 (Computer Science) · JCR Q1 (ENGINEERING, ELECTRICAL & ELECTRONIC)
Authors: Chenlu Wang; Yuhuai Peng; Dawei Zhang; Ryan Alturki; Bandar Alshawi; Majid Alotaibi
Journal: IEEE Transactions on Consumer Electronics, vol. 71, no. 2, pp. 6041-6054
Published: 2025-04-28
DOI: 10.1109/TCE.2025.3564777
URL: https://ieeexplore.ieee.org/document/10978062/
Citations: 0
Abstract
The proliferation of the Internet of Things (IoT) and embedded computing has led to widespread deployment of smart consumer electronics requiring edge-based Artificial Intelligence (AI) capabilities. However, the heterogeneous nature of sensing data and dynamic edge environments pose significant challenges for efficient model inference on resource-constrained devices. To address these challenges, this paper presents a lightweight collaborative inference framework designed for consumer electronics. First, we formulate the inference optimization problem as a Mixed-Integer Nonlinear Programming (MINLP) problem, considering channel pruning, early exit, and cloud offloading decisions to optimize the trade-off between accuracy and computational cost. Second, we propose a selective model activation mechanism based on a Markov Decision Process (MDP), which employs a recursive self-attention mechanism to dynamically track inference budgets and guide decision-making through encoder-decoder architectures. The mechanism integrates entropy regularization during training to ensure robust and diverse execution paths. Comprehensive experiments demonstrate that our framework achieves a 65.50% reduction in model parameters and an 80.68% reduction in inference Floating Point Operations (FLOPs) while keeping accuracy loss within 0.81% of the original model, making it suitable for real-time AI applications on resource-constrained consumer electronics.
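The abstract names early exit and cloud offloading as decision variables but does not state the decision rule. As a rough illustration only, the sketch below shows a generic entropy-thresholded early exit with an offloading fallback under a FLOPs budget. It is not the authors' MDP-based mechanism; the function names, the threshold, and the per-stage cost list are all hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs: Sequence[float]) -> float:
    """Shannon entropy in nats; low entropy means a confident prediction."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def infer_with_early_exit(
    exit_logits: Sequence[Sequence[float]],
    flops_per_stage: Sequence[float],
    threshold: float,
    budget: float,
) -> Tuple[str, int, Optional[int], float]:
    """Walk the on-device exit heads in order (hypothetical setup).

    Returns (decision, stage, label, flops_spent), where decision is
    "exit" if a head is confident enough, or "offload" if the budget
    runs out or no head is confident. In a real system the logits would
    be computed lazily, stage by stage; they are precomputed here to
    keep the sketch self-contained.
    """
    spent = 0.0
    for stage, (logits, cost) in enumerate(zip(exit_logits, flops_per_stage)):
        if spent + cost > budget:
            # Running this stage would exceed the budget: hand off to the cloud.
            return ("offload", stage, None, spent)
        spent += cost
        probs = softmax(logits)
        if entropy(probs) <= threshold:
            label = max(range(len(probs)), key=probs.__getitem__)
            return ("exit", stage, label, spent)
    # Every local head ran and none was confident enough.
    return ("offload", len(flops_per_stage), None, spent)
```

With a generous budget, an input whose second head is sharply peaked exits locally at that head; with a budget too small to reach the second head, the same input is offloaded instead. A learned policy, such as the MDP-based mechanism the abstract describes, would replace the fixed threshold.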
About the journal:
The main focus for the IEEE Transactions on Consumer Electronics is the engineering and research aspects of the theory, design, construction, manufacture or end use of mass market electronics, systems, software and services for consumers.