Title: Optimizing Serverless Performance Through Game Theory and Efficient Resource Scheduling
Authors: Pengwei Wang; Yi Li; Chao Fang; Yichen Zhong; Zhijun Ding
Journal: IEEE Transactions on Computers, vol. 74, no. 6, pp. 1990-2002 (Journal Article)
Publication date: 2025-03-03
DOI: 10.1109/TC.2025.3547158
URL: https://ieeexplore.ieee.org/document/10908572/
Impact factor: 3.6
JCR: Q2 (Computer Science, Hardware & Architecture)
Citations: 0
Abstract
The scaler and the scheduler of a serverless system are the two cornerstones that ensure service quality and efficiency. However, existing scalers and schedulers are constrained by static thresholds, scaling latency, and single-dimensional optimization, which makes it difficult for them to respond quickly to the dynamic workloads of functions with different characteristics. This paper proposes a game theory-based scaler and a dual-layer optimization scheduler to enhance the resource management and task allocation capabilities of serverless systems. In the scaler, we introduce the Hawkes process to quantify the “temperature” of a function as an indicator of its instantaneous invocation rate. By combining dynamic thresholds with continuous monitoring, the scaler ensures that scaling operations no longer lag behind changes in function instances and can even warm instances up in advance. For the scheduler, we draw on bin-packing strategies to optimize the distribution of containers and reduce resource fragmentation. A new concept, the “CPU starvation degree”, is introduced to denote the degree of CPU contention during function execution, ensuring that function requests are scheduled efficiently. Experimental analysis on ServerlessBench and Alibaba clusterdata indicates that, compared with classical and state-of-the-art scalers and schedulers, the proposed scaler and scheduler achieve at least a 149% improvement in the Quality-Price Ratio, which represents the trade-off between performance and cost.
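The “temperature” idea in the abstract can be illustrated with the standard conditional intensity of a univariate Hawkes process. The following is only a minimal sketch, assuming an exponential kernel, lambda(t) = mu + sum over past events of alpha * exp(-beta * (t - t_i)). The kernel choice, the parameter values (mu, alpha, beta), the sample timestamps, and the function name are assumptions made for illustration and are not taken from the paper.

```python
import math


def hawkes_intensity(t, event_times, mu=0.1, alpha=0.5, beta=1.0):
    """Conditional intensity of a univariate Hawkes process at time t.

    Assumes an exponential kernel (an illustrative choice, not the paper's):
        lambda(t) = mu + sum_{t_i < t} alpha * exp(-beta * (t - t_i))
    mu is the baseline rate, alpha the jump added by each past invocation,
    and beta the rate at which that excitation decays.
    """
    return mu + sum(
        alpha * math.exp(-beta * (t - t_i))
        for t_i in event_times
        if t_i < t
    )


# Hypothetical invocation timestamps (seconds) for one function.
invocations = [0.0, 0.2, 0.3, 5.0]

# Shortly after a burst of invocations the function is "hot" ...
hot = hawkes_intensity(0.4, invocations)
# ... and after a long quiet period it has cooled back toward the baseline mu.
cold = hawkes_intensity(4.9, invocations)

print(f"temperature after burst: {hot:.3f}")
print(f"temperature after quiet period: {cold:.3f}")
```

In a scaler built along these lines, a value such as `hot` could be compared against a dynamic threshold to decide whether to add or pre-warm instances; how the paper actually sets and adapts its thresholds is not specified in the abstract.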
About the journal:
The IEEE Transactions on Computers is a monthly publication with a wide distribution to researchers, developers, technical managers, and educators in the computer field. It publishes papers on research in areas of current interest to the readers. These areas include, but are not limited to, the following: a) computer organizations and architectures; b) operating systems, software systems, and communication protocols; c) real-time systems and embedded systems; d) digital devices, computer components, and interconnection networks; e) specification, design, prototyping, and testing methods and tools; f) performance, fault tolerance, reliability, security, and testability; g) case studies and experimental and theoretical evaluations; and h) new and important applications and trends.