CrowdHMTware: A Cross-Level Co-Adaptation Middleware for Context-Aware Mobile DL Deployment

IF 9.2 · Zone 2 (Computer Science) · JCR Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Transactions on Mobile Computing · Pub Date: 2025-03-27 · DOI: 10.1109/TMC.2025.3549399

Sicong Liu; Bin Guo; Shiyan Luo; Yuzhan Wang; Hao Luo; Cheng Fang; Yuan Xu; Ke Ma; Yao Li; Zhiwen Yu

IEEE Transactions on Mobile Computing, vol. 24, no. 8, pp. 7615-7631. https://ieeexplore.ieee.org/document/10944517/

Citations: 0

Abstract

Many deep learning (DL) powered mobile and wearable applications today continuously and unobtrusively sense their ambient surroundings to enhance all aspects of human life. To enable robust and private mobile sensing, DL models are often deployed locally on resource-constrained mobile devices using techniques such as model compression or offloading. However, existing methods, whether at the front-end algorithm level (i.e., DL model compression/partitioning) or the back-end scheduling level (i.e., operator/resource scheduling), cannot run locally online: they either require offline retraining to ensure accuracy or rely on manually pre-defined strategies, and therefore struggle with dynamic adaptability. The primary challenge lies in feeding runtime performance back from the back-end level to front-end optimization decisions. Moreover, adaptive mobile DL model porting middleware with cross-level co-adaptation remains little explored, particularly in diverse and dynamic mobile environments. In response, we introduce CrowdHMTware, a dynamic context-adaptive DL model deployment middleware for heterogeneous mobile devices. It establishes an automated adaptation loop between cross-level functional components, i.e., elastic inference, scalable offloading, and a model-adaptive engine, enhancing scalability and adaptability. Experiments with four typical tasks across 15 platforms and a real-world case study demonstrate that CrowdHMTware can effectively scale DL model, offloading, and engine actions across diverse platforms and tasks. It hides run-time system issues from developers, reducing the required developer expertise.
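The abstract describes the adaptation loop only at a high level. As a rough illustration of what "feeding runtime performance back from the back end to front-end decisions" could look like, here is a minimal Python sketch. It is not the paper's algorithm or API: the `Action` fields, `choose_action`, `adaptation_loop`, the latency budget, and the "most accurate action that fits the budget" rule are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """One hypothetical deployment choice spanning the three levels."""
    model_variant: str   # front end: which model variant to run (assumed name)
    offload: bool        # whether part of the model is offloaded
    engine_threads: int  # back end: an assumed engine scheduling knob
    accuracy: float      # assumed, pre-profiled accuracy of this variant


def choose_action(actions, measured_latency_ms, budget_ms):
    """Pick the most accurate action whose last measured latency fits the budget.

    Actions with no measurement yet count as feasible, so each gets tried
    at least once. If nothing fits, fall back to the fastest measured action.
    """
    feasible = [a for a in actions
                if measured_latency_ms.get(a, 0.0) <= budget_ms]
    if feasible:
        return max(feasible, key=lambda a: a.accuracy)
    return min(actions, key=lambda a: measured_latency_ms[a])


def adaptation_loop(actions, run_inference, budget_ms, steps):
    """Run, measure at the back end, and feed the result into the next choice."""
    measured = {}
    history = []
    for _ in range(steps):
        action = choose_action(actions, measured, budget_ms)
        measured[action] = run_inference(action)  # runtime feedback (latency, ms)
        history.append(action)
    return history
```

In this sketch `run_inference` is a caller-supplied function that executes one inference under the given action and returns its latency in milliseconds. With a 50 ms budget, a "full" variant measured at 120 ms and a "small" variant at 40 ms, the loop tries the full variant once and then settles on the small one.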

Source journal

IEEE Transactions on Mobile Computing (Engineering & Technology: Telecommunications)

CiteScore

12.90

Self-citation rate

2.50%

Articles published

403

Review time

6.6 months

Journal description: IEEE Transactions on Mobile Computing addresses key technical issues related to various aspects of mobile computing. This includes (a) architectures, (b) support services, (c) algorithm/protocol design and analysis, (d) mobile environments, (e) mobile communication systems, (f) applications, and (g) emerging technologies. Topics of interest span a wide range, covering mobile networks and hosts, mobility management, multimedia, operating system support, power management, online and mobile environments, security, scalability, reliability, and emerging technologies such as wearable computers, body area networks, and wireless sensor networks. The journal serves as a comprehensive platform for advancements in mobile computing research.