Minxian Xu, Lei Yang, Yang Wang, Chengxi Gao, Linfeng Wen, Guoyao Xu, Liping Zhang, Kejiang Ye, Chengzhong Xu
Software: Practice and Experience. Journal Article, published 2023-09-16. DOI: 10.1002/spe.3271 (https://doi.org/10.1002/spe.3271)
Practice of Alibaba cloud on elastic resource provisioning for large‐scale microservices cluster
Summary: Cloud‐native architecture is becoming increasingly crucial for today's cloud computing environments due to the need for speed and flexibility in developing applications. It uses microservice technology to break down traditional monolithic applications into lightweight, self‐contained microservice components. However, as microservices grow in scale and develop dynamic inter‐dependencies, they pose new challenges in resource provisioning that traditional resource scheduling approaches cannot fully address. Microservices with different resource demands and latency requirements can form complex calling chains, making it difficult to allocate resources to each component accurately and at a fine granularity while maintaining the overall quality of service of the chain. Alibaba Cloud has fully embraced cloud‐native and microservice technologies to drive its key businesses and scenarios, including the Double 11 Shopping Festival. In this work, we address the research problem of how to efficiently provision resources for a microservice platform of growing scale while ensuring the performance of latency‐critical microservices. To this end, we present in‐depth analyses of Alibaba's microservice cluster and propose optimized resource provisioning algorithms that enhance resource utilization while meeting latency requirements. First, we analyze the distinct features of microservices in Alibaba's cluster compared to traditional applications. Then we present Alibaba's resource capacity provisioning workflow and framework, which address the challenges of resource provisioning for large‐scale and latency‐critical microservice clusters. Finally, we propose resource provisioning algorithms that enhance Alibaba's current practice by making both proactive and reactive scheduling decisions based on different workload patterns. These algorithms can improve resource usage by 10%–15% in Alibaba's clusters while maintaining the latency that microservices require.