Joint Model Assignment and Resource Allocation for Cost-Effective Mobile Generative Services
Shuangwei Gao, Peng Yang, Yuxin Kong, Feng Lyu, Ning Zhang
arXiv - CS - Distributed, Parallel, and Cluster Computing, 2024-09-09
https://doi.org/arxiv-2409.09072
Citation count: 0
Abstract
Artificial Intelligence Generated Content (AIGC) services can efficiently
satisfy user-specified content creation demands, but the high computational
requirements pose various challenges to supporting mobile users at scale. In
this paper, we present our design of an edge-enabled AIGC service provisioning
system to properly assign computing tasks of generative models to edge servers,
thereby improving overall user experience and reducing content generation
latency. Specifically, once the edge server receives user-requested task
prompts, it dynamically assigns appropriate models and allocates computing
resources based on the features of each prompt category. The generated content
is then delivered to users. The key to this system is a proposed probabilistic
model assignment approach, which estimates the quality score of the generated
content for each prompt based on category labels. Next, we introduce a
heuristic algorithm that enables adaptive configuration of both generation
steps and resource allocation, according to the various task requests received
by each generative model on the edge. Simulation results demonstrate that the
designed system can effectively enhance the quality of generated content by up
to 4.7% while reducing response delay by up to 39.1% compared to benchmarks.
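
The abstract does not give the assignment rule or the heuristic itself, so the following is only a minimal sketch of the two ideas under stated assumptions: a softmax over hypothetical per-category quality estimates stands in for the probabilistic model assignment, and a proportional compute split with a delay-bounded step choice stands in for the adaptive configuration of generation steps and resources. All names, numbers, and the linear delay model are illustrative assumptions, not the authors' method.

```python
import math
import random

# Hypothetical quality estimates per (prompt category, model).
# Illustrative numbers only; none are taken from the paper.
QUALITY = {
    "portrait": {"model_a": 0.62, "model_b": 0.71},
    "landscape": {"model_a": 0.68, "model_b": 0.64},
}


def assignment_probs(category, temperature=0.05):
    """Softmax over the estimated quality scores of one prompt category."""
    scores = QUALITY[category]
    top = max(scores.values())  # subtract the max for numerical stability
    weights = {m: math.exp((s - top) / temperature) for m, s in scores.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}


def assign_model(category, rng):
    """Sample a model for a prompt according to its category's probabilities."""
    probs = assignment_probs(category)
    models = list(probs)
    return rng.choices(models, weights=[probs[m] for m in models], k=1)[0]


def allocate(queue_lengths, total_compute):
    """Split the compute budget in proportion to each model's queued requests."""
    total = sum(queue_lengths.values())
    if total == 0:
        return {m: 0.0 for m in queue_lengths}
    return {m: total_compute * n / total for m, n in queue_lengths.items()}


def pick_steps(n_requests, compute, delay_budget,
               step_cost=1.0, candidates=(50, 30, 20, 10)):
    """Pick the largest step count whose estimated delay fits the budget.

    Assumed delay model: n_requests * steps * step_cost / compute.
    Falls back to the smallest candidate if none fits.
    """
    for steps in candidates:
        if compute > 0 and n_requests * steps * step_cost / compute <= delay_budget:
            return steps
    return candidates[-1]


if __name__ == "__main__":
    rng = random.Random(0)
    prompts = ["portrait", "landscape", "portrait", "portrait"]
    queues = {"model_a": 0, "model_b": 0}
    for category in prompts:
        queues[assign_model(category, rng)] += 1
    shares = allocate(queues, total_compute=10.0)
    for model, n in queues.items():
        steps = pick_steps(n, shares[model], delay_budget=10.0)
        print(model, n, shares[model], steps)
```

In this sketch a lower `temperature` concentrates assignments on the model with the highest estimated quality for a category, while a higher value spreads requests more evenly. Choosing fewer generation steps under load trades some quality for lower delay, which is the kind of balance the abstract describes.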