Properties of Horizontal Pod Autoscaling Algorithms and Application for Scaling Cloud-Native Network Functions

IF 4.7 | Zone 2, Computer Science | JCR Q1, COMPUTER SCIENCE, INFORMATION SYSTEMS

IEEE Transactions on Network and Service Management Pub Date : 2025-01-20 DOI:10.1109/TNSM.2025.3532121

Tien Van Do;Nam H. Do;Csaba Rotter;T. V. Lakshman;Csaba Biro;Tamas Bérczes

Volume 22, Issue 2, pp. 1889-1898. URL: https://ieeexplore.ieee.org/document/10847897/

Citations: 0

Abstract

With the growing adoption of network function virtualization, telco core network elements and network functions will increasingly be designed and deployed as cloud-native application instances. To ensure the efficient use of virtualised resources and meet diverse quality-of-service requirements, a resource scaling algorithm is used to scale the number of application instances up or down depending on variations in the traffic offered by customers. Most of the observed performance metrics for a service are a function of the current customer traffic and the current number of application instances providing the service. The ubiquitous use of Kubernetes, the popular open-source framework for deployment and management of cloud-native functions, has resulted in variants of the Kubernetes Horizontal Pod Autoscaling (HPA) algorithm being widely used to change the number of application instances providing network functions as traffic demands vary. This change is made by determining whether a selected performance metric of interest is outside a range set by two input parameters (the desired metric value and the tolerance parameter). In this paper, we investigate the characteristics of the HPA algorithms and prove that there are only a finite number of intervals for the tolerance parameter. Further, any choice of the tolerance parameter from a given interval leads to similar computational decisions on the recommended number of application instances. As a consequence, the number of parameter setting choices is finite, because the desired metric value can only be an integer in specific ranges. Additionally, we investigate the use of HPA for scaling application instances that provide session-based services and establish lower and upper bounds for the performance of the HPA scaling algorithms in this scenario. Our contributions can help operators find appropriate parameter settings efficiently: administrators of Kubernetes clusters only need to select parameters from a finite number of choices (instead of infinitely many) for scaling cloud-native applications.
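The two-parameter decision described in the abstract can be illustrated with a minimal Python sketch. It follows the basic scaling rule in the public Kubernetes documentation (scale by the ratio of the current to the desired metric value, and do nothing when that ratio is within the tolerance of 1.0). It is not the paper's analysis, and the function name, argument names, and sample numbers are assumptions chosen for illustration.

```python
import math


def hpa_recommendation(current_replicas: int, current_metric: float,
                       desired_metric: float, tolerance: float = 0.1) -> int:
    """Replica count suggested by the basic HPA rule (illustrative sketch)."""
    ratio = current_metric / desired_metric
    # Inside the tolerance band around 1.0 the replica count is left alone.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    # Otherwise scale proportionally and round up to a whole instance.
    return math.ceil(current_replicas * ratio)


# Hypothetical numbers: 4 instances, desired metric value of 50.
print(hpa_recommendation(4, 80, 50))        # metric far above the target
print(hpa_recommendation(4, 52, 50))        # metric inside the default band
print(hpa_recommendation(4, 52, 50, 0.01))  # same metric, tighter tolerance
print(hpa_recommendation(4, 52, 50, 0.05))  # same metric, looser tolerance
```

In this sketch, the last two calls differ only in the tolerance. For a fixed observation, the outcome depends solely on whether the tolerance lies above or below the distance of the ratio from 1.0, which gives an informal feel for why ranges of tolerance values can lead to the same decision.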


Source journal

IEEE Transactions on Network and Service Management (Computer Science: Computer Networks and Communications)

CiteScore: 9.30

Self-citation rate: 15.10%

Publication count: 325

Journal description: IEEE Transactions on Network and Service Management will publish (online only) peer-reviewed, archival-quality papers that advance the state of the art and practical applications of network and service management. Theoretical research contributions (presenting new concepts and techniques) and applied contributions (reporting on experiences and experiments with actual systems) will be encouraged. These transactions will focus on the key technical issues related to: Management Models, Architectures and Frameworks; Service Provisioning, Reliability and Quality Assurance; Management Functions; Enabling Technologies; Information and Communication Models; Policies; Applications and Case Studies; Emerging Technologies and Standards.