Tien Van Do;Nam H. Do;Csaba Rotter;T. V. Lakshman;Csaba Biro;Tamas Bérczes
Title: Properties of Horizontal Pod Autoscaling Algorithms and Application for Scaling Cloud-Native Network Functions
Journal: IEEE Transactions on Network and Service Management, vol. 22, no. 2, pp. 1889-1898
Publication date: 2025-01-20
Publication type: Journal Article
DOI: 10.1109/TNSM.2025.3532121
URL: https://ieeexplore.ieee.org/document/10847897/
Impact factor: 4.7 (JCR Q1, Computer Science, Information Systems)
Citation count: 0
Abstract
With the growing adoption of network function virtualization, telco core network elements and network functions will increasingly be designed and deployed as cloud-native application instances. To ensure the efficient use of virtualised resources and meet diverse quality-of-service requirements, a resource scaling algorithm is used to scale the number of application instances up or down depending on variations in the traffic offered by customers. Most of the observed performance metrics for a service are a function of the current customer traffic and the current number of application instances providing the service. The ubiquitous use of Kubernetes, the popular open-source framework for deploying and managing cloud-native functions, has resulted in variants of the Kubernetes Horizontal Pod Autoscaling (HPA) algorithm being widely used to change the number of application instances providing network functions as traffic demands vary. The algorithm changes the number of instances when a selected performance metric of interest falls outside a range set by two input parameters (the desired metric value and the tolerance parameter). In this paper, we investigate the characteristics of the HPA algorithms and prove that there are only a finite number of intervals for the tolerance parameter. Further, any choice of the tolerance parameter from a given interval leads to similar computational decisions on the recommended number of application instances. As a consequence, the number of parameter setting choices is finite, because the desired metric value can only be an integer in specific ranges. Additionally, we investigate the use of HPA for scaling application instances that provide session-based services and establish lower and upper bounds for the performance of the HPA scaling algorithms in this scenario. Our contributions can help operators find appropriate parameter settings efficiently: administrators of Kubernetes clusters only need to select parameters from a finite number of choices (instead of infinitely many) when scaling cloud-native applications.
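The decision rule summarised in the abstract can be illustrated with a minimal Python sketch. It follows the replica formula given in the public Kubernetes HPA documentation (scale to the ceiling of current replicas times the ratio of observed to desired metric, unless that ratio is within the tolerance of 1.0). It is not the paper's own formulation or one of the variants it analyses; the function name `hpa_recommend` and its parameter names are hypothetical, and the default tolerance of 0.1 is the Kubernetes default, not a value taken from the paper.

```python
import math


def hpa_recommend(current_replicas: int,
                  current_metric: float,
                  desired_metric: float,
                  tolerance: float = 0.1) -> int:
    """Recommend a replica count using the basic Kubernetes HPA rule.

    Hypothetical helper for illustration only: the names and the default
    tolerance are assumptions, not taken from the paper.
    """
    if current_replicas <= 0:
        raise ValueError("current_replicas must be positive")
    if desired_metric <= 0:
        raise ValueError("desired_metric must be positive")

    # Ratio of the observed metric to the desired (target) metric value.
    ratio = current_metric / desired_metric

    # Inside the tolerance band around 1.0: keep the current replica count.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas

    # Outside the band: scale proportionally and round up.
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # Observed metric is twice the target, so the replica count doubles.
    print(hpa_recommend(4, 200, 100))
    # Observed metric is within 10% of the target, so nothing changes.
    print(hpa_recommend(4, 105, 100))
    # Two different tolerances below the 30% deviation give the same
    # recommendation; a tolerance above it suppresses scaling.
    print(hpa_recommend(4, 130, 100, tolerance=0.1),
          hpa_recommend(4, 130, 100, tolerance=0.2),
          hpa_recommend(4, 130, 100, tolerance=0.35))
```

In the last call, tolerances of 0.1 and 0.2 produce the same recommendation, which is one concrete instance of the abstract's remark that tolerance values from the same interval lead to similar decisions. It is a single sample point, not a substitute for the paper's proof.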
About the journal:
IEEE Transactions on Network and Service Management will publish (online only) peer-reviewed, archival-quality papers that advance the state of the art and practical applications of network and service management. Theoretical research contributions (presenting new concepts and techniques) and applied contributions (reporting on experiences and experiments with actual systems) will be encouraged. These transactions will focus on the key technical issues related to: Management Models, Architectures and Frameworks; Service Provisioning, Reliability and Quality Assurance; Management Functions; Enabling Technologies; Information and Communication Models; Policies; Applications and Case Studies; Emerging Technologies and Standards.