Characterizing and Scheduling of Diffusion Process for Text-to-Image Generation in Edge Networks
Authors: Shuangwei Gao; Peng Yang; Yuxin Kong; Feng Lyu; Ning Zhang
Journal: IEEE Transactions on Mobile Computing, vol. 24, no. 10, pp. 11137-11150 (Journal Article)
Publication date: 2025-03-27
DOI: 10.1109/TMC.2025.3574065
URL: https://ieeexplore.ieee.org/document/11016084/
Impact factor: 9.2; JCR: Q1 (Computer Science, Information Systems)
Citations: 0
Abstract
Artificial Intelligence-Generated Content (AIGC) technology is transforming content creation by enabling diverse, customized, high-quality services. However, the limited computing resources on mobile devices hinder the provisioning of AIGC services at scale and make it challenging to guarantee the content quality that users require. To address these challenges, we first investigate the characteristics of prompt categories and inference models in the Text-to-Image (T2I) diffusion process. We observe that model size, denoising steps, and computing resources are the three deciding factors of image generation utility. Based on this insight, we design an edge-assisted AIGC service system to efficiently process multi-user T2I generative requests, employing a multi-flow queuing model to capture multi-user dynamics and characterize the impact of diffusion scheduling on service latency. The system schedules the diffusion process of T2I generation across edge-deployed models, balancing service quality against computing resources. To maximize generation utility under resource constraints, we propose a Monte Carlo Tree Search-based diffusion scheduling algorithm with an embedded adaptive computing resource allocation subroutine. This algorithm ensures that resource allocation adapts dynamically to scheduling decisions in real time, enabling an effective trade-off between service quality and latency. Extensive experimental comparisons against baseline approaches demonstrate that the proposed system can enhance generation utility by up to 7.3%, achieving a 2.9% improvement in quality score and a 33.3% reduction in service latency.
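The trade-off described in the abstract (model size, denoising steps, and computing resources jointly determining utility) can be illustrated with a toy sketch. This is not the paper's algorithm: it replaces the Monte Carlo Tree Search with plain exhaustive enumeration, and every name and number in it (the two models, the step choices, the quality curve, the latency model, the latency weight, and the proportional compute split) is a hypothetical assumption chosen only for illustration.

```python
from itertools import product

# Hypothetical edge-deployed models: (name, quality ceiling, compute cost per denoising step).
MODELS = [("small", 0.70, 1.0), ("large", 0.95, 3.0)]
# Hypothetical denoising step counts a scheduler may pick from.
STEP_CHOICES = [10, 20, 40]


def quality(ceiling, steps):
    """Toy saturating curve: more denoising steps help, with diminishing returns."""
    return ceiling * steps / (steps + 10.0)


def latency(cost_per_step, steps, compute_share):
    """Toy latency: total work divided by the compute share given to the request."""
    return cost_per_step * steps / compute_share


def utility(ceiling, cost_per_step, steps, compute_share, latency_weight=0.01):
    """Quality minus a latency penalty; the weight is an arbitrary illustrative value."""
    return quality(ceiling, steps) - latency_weight * latency(cost_per_step, steps, compute_share)


def best_schedule(num_requests, total_compute):
    """Try every (model, steps) assignment and return the one with the highest total utility.

    Compute is split in proportion to each request's work, a simple stand-in
    for an adaptive resource allocation subroutine.
    """
    options = [(m, s) for m in MODELS for s in STEP_CHOICES]
    best, best_value = None, float("-inf")
    for assignment in product(options, repeat=num_requests):
        work = [m[2] * s for m, s in assignment]
        total_work = sum(work)
        value = 0.0
        for (m, s), w in zip(assignment, work):
            share = total_compute * w / total_work
            value += utility(m[1], m[2], s, share)
        if value > best_value:
            best, best_value = assignment, value
    return best, best_value


if __name__ == "__main__":
    for budget in (1.0, 1e9):
        schedule, value = best_schedule(2, budget)
        print(budget, [(m[0], s) for m, s in schedule], round(value, 3))
```

Under these assumptions, a scarce compute budget pushes every request toward the small model with few steps, while an abundant budget selects the large model with many steps. Exhaustive enumeration grows exponentially with the number of requests, which is the kind of search space where a tree-search method becomes attractive.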
About the journal:
IEEE Transactions on Mobile Computing addresses key technical issues related to various aspects of mobile computing. This includes (a) architectures, (b) support services, (c) algorithm/protocol design and analysis, (d) mobile environments, (e) mobile communication systems, (f) applications, and (g) emerging technologies. Topics of interest span a wide range, covering aspects like mobile networks and hosts, mobility management, multimedia, operating system support, power management, online and mobile environments, security, scalability, reliability, and emerging technologies such as wearable computers, body area networks, and wireless sensor networks. The journal serves as a comprehensive platform for advancements in mobile computing research.