Practice of Alibaba Cloud on elastic resource provisioning for large-scale microservices cluster

Minxian Xu, Lei Yang, Yang Wang, Chengxi Gao, Linfeng Wen, Guoyao Xu, Liping Zhang, Kejiang Ye, Chengzhong Xu
Journal: Software: Practice and Experience
Publication date: 2023-09-16
Publication type: Journal Article
DOI: 10.1002/spe.3271 (https://doi.org/10.1002/spe.3271)
Citations: 0

Practice of Alibaba cloud on elastic resource provisioning for large‐scale microservices cluster
Summary

Cloud-native architecture is becoming increasingly crucial for today's cloud computing environments because application development demands speed and flexibility. It uses microservice technology to break down traditional monolithic applications into lightweight, self-contained microservice components. However, as microservices grow in scale and develop dynamic inter-dependencies, they pose new challenges in resource provisioning that traditional resource scheduling approaches cannot fully address. Microservices with different resource demands and latency requirements can form complex calling chains, making it difficult to allocate resources to each component accurately and at fine granularity while maintaining the overall quality of service of the chain. Alibaba Cloud has fully embraced cloud-native and microservice technologies to drive its key businesses and scenarios, including the Double 11 Shopping Festival. In this work, we address the research problem of how to efficiently provision resources for a microservice platform of growing scale while ensuring the performance of latency-critical microservices. To this end, we present in-depth analyses of Alibaba's microservice cluster and propose optimized resource provisioning algorithms that enhance resource utilization while meeting the latency requirement. First, we analyze the distinct features of microservices in Alibaba's cluster compared to traditional applications. Then we present Alibaba's resource capacity provisioning workflow and framework, which address challenges in resource provisioning for large-scale and latency-critical microservice clusters. Finally, we propose resource provisioning algorithms that enhance Alibaba's current practice by making both proactive and reactive scheduling decisions based on different workload patterns; these can improve resource usage by 10%–15% in Alibaba's clusters while maintaining the necessary latency for microservices.
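The summary does not describe the algorithms themselves, so the following is only an illustrative sketch of what combining proactive and reactive scaling decisions by workload pattern might look like. It is not the authors' method. Every name, threshold, and heuristic here (the mean-based forecast, the stability check, the one-replica reactive step, the 20% headroom) is a hypothetical assumption made for illustration.

```python
from math import ceil
from statistics import mean


def is_stable(history, tolerance=0.25):
    """Assumed pattern check: load counts as predictable if every
    sample lies within `tolerance` (relative) of the mean."""
    avg = mean(history)
    return all(abs(x - avg) <= tolerance * avg for x in history)


def proactive_replicas(history, per_replica_capacity, headroom=1.2):
    """Assumed proactive rule: forecast load as the mean of recent
    samples, add headroom, and size the replica count for it."""
    forecast = mean(history)
    return max(1, ceil(forecast * headroom / per_replica_capacity))


def reactive_replicas(replicas, observed_latency_ms, latency_slo_ms):
    """Assumed reactive rule: add one replica when the observed
    latency exceeds the latency target, otherwise keep the count."""
    if observed_latency_ms > latency_slo_ms:
        return replicas + 1
    return replicas


def decide_replicas(history, current_replicas, observed_latency_ms,
                    latency_slo_ms, per_replica_capacity):
    """Plan ahead for stable workloads, keep the current size for
    bursty ones, then apply the reactive latency guard in both cases."""
    if is_stable(history):
        target = proactive_replicas(history, per_replica_capacity)
    else:
        target = current_replicas
    return reactive_replicas(target, observed_latency_ms, latency_slo_ms)


if __name__ == "__main__":
    # Stable load (requests/s), latency within target: sized from the forecast.
    print(decide_replicas([100, 110, 90, 105], 2, 80, 100, 50))
    # Bursty load, latency above target: one replica added reactively.
    print(decide_replicas([100, 300, 50, 400], 4, 150, 100, 50))
```

In this sketch the reactive guard runs after the proactive sizing, so a latency violation can still add capacity even when the forecast looked sufficient. An empty `history` raises `statistics.StatisticsError`; a real controller would need to handle that case and would also need a scale-down rule, which is omitted here.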