COPA: A Combined Autoscaling Method for Kubernetes
Zhijun Ding, Qichen Huang
2021 IEEE International Conference on Web Services (ICWS), published 2021-09-01
DOI: https://doi.org/10.1109/ICWS53863.2021.00061
Citations: 7
Abstract
Autoscaling is one of the major features of cloud computing, aiming to improve Quality of Service (QoS) in response to fluctuating workloads. Existing state-of-the-art autoscaling methods for Kubernetes focus on a single scaling mode, that is, either horizontal scaling only or vertical scaling only. With horizontal scaling, a high resource utilization rate cannot always be guaranteed; with vertical scaling, microservice instances exhibit a performance ceiling, so performance does not grow indefinitely as the supply of resources increases. In this paper, we propose a novel combined scaling method called COPA. Based on the collected microservice performance data, the real-time workload, the expected response time, and the microservice instance scheme at runtime, COPA uses a queuing network model to calculate a combined scaling scheme that aims to minimize the default cost and the resource cost. We evaluated our approach in a Kubernetes cluster and compared it with existing state-of-the-art autoscaling methods under four different workload types. The experiments show a 1.22× reduction in resource cost relative to the baseline method while still ensuring QoS.
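The abstract does not spell out the queuing model or the cost function, so the following is only a minimal sketch of what a combined (horizontal plus vertical) scaling decision could look like, not COPA itself. It assumes each instance behaves as an independent M/M/1 queue with the workload split evenly across replicas, treats the expected response time as a hard constraint rather than pricing violations as a default cost, and uses a made-up performance table (`PROFILE`) in which throughput flattens as CPU grows, to mimic the performance ceiling. All names and numbers here are hypothetical.

```python
from dataclasses import dataclass
from math import ceil
from typing import Dict, Optional


@dataclass(frozen=True)
class Scheme:
    """A combined scaling scheme: how many replicas, and how large each one is."""
    replicas: int
    cpu_per_replica: float  # CPU cores allocated to each instance

    @property
    def resource_cost(self) -> float:
        # Assumption: cost is simply the total number of cores requested.
        return self.replicas * self.cpu_per_replica


# Hypothetical profiling data: CPU cores -> service rate (requests/s) of one
# instance. The rate barely grows past 2 cores (the "performance ceiling").
PROFILE: Dict[float, float] = {0.5: 40.0, 1.0: 75.0, 2.0: 100.0, 4.0: 105.0}


def plan(workload: float, target_rt: float,
         profile: Dict[float, float] = PROFILE,
         max_replicas: int = 50) -> Optional[Scheme]:
    """Return the cheapest scheme whose M/M/1 response time meets target_rt.

    workload  -- total arrival rate in requests/s
    target_rt -- expected (maximum mean) response time in seconds
    """
    best: Optional[Scheme] = None
    for cpu, service_rate in profile.items():
        # M/M/1 mean response time is 1 / (mu - lambda). Requiring it to be
        # <= target_rt bounds the per-instance arrival rate:
        #   lambda <= mu - 1 / target_rt
        capacity = service_rate - 1.0 / target_rt
        if capacity <= 0:
            continue  # this instance size can never meet the target
        replicas = max(1, ceil(workload / capacity))
        if replicas > max_replicas:
            continue
        candidate = Scheme(replicas, cpu)
        if best is None or candidate.resource_cost < best.resource_cost:
            best = candidate
    return best


if __name__ == "__main__":
    print(plan(workload=200.0, target_rt=0.125))
```

With this made-up table, neither the smallest nor the largest instance size wins: many small replicas waste capacity on queuing headroom, while large replicas pay for cores that add little throughput. That trade-off is the motivation the abstract gives for combining the two scaling modes.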