Resource allocation and consolidation in a multi-core server cluster using a Markov decision process model

International Symposium on Quality Electronic Design (ISQED). Pub Date: 2013-03-04. DOI: 10.1109/ISQED.2013.6523677

Yanzhi Wang, Shuang Chen, H. Goudarzi, Massoud Pedram


Citations: 29

Abstract

Distributed computing systems have attracted considerable attention because of the increasing demand for high-performance computing and storage. Resource allocation is one of the most important challenges in distributed systems, especially when clients have Service Level Agreements (SLAs) and the total profit depends on how well the system meets them. This paper considers an SLA-based resource allocation problem in a server cluster. The objective is to maximize the total profit, defined as the total price gained from serving the clients minus the operating cost of the server cluster. The total price depends on the average request response time for each client, as defined in that client's utility function, while the operating cost is related to the total energy consumption. A joint optimization framework is proposed, comprising request dispatching, dynamic voltage and frequency scaling (DVFS) for individual cores, and server-level and core-level consolidation. Each core in the cluster is modeled using a continuous-time Markov decision process (CTMDP). A near-optimal hierarchical solution is proposed, consisting of a central manager and distributed local agents. Each local agent employs a linear programming-based CTMDP solving method to solve the DVFS problem for its core. The central manager solves the request dispatching problem and finds the optimal number of turned-on cores and servers for request processing, thereby achieving a desirable tradeoff between service request response time and power consumption. Experimental results demonstrate that the proposed near-optimal resource allocation and consolidation algorithm consistently outperforms baseline algorithms.
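The profit objective described in the abstract (price from serving clients, which depends on response time, minus an energy-related operating cost) can be illustrated with a small sketch. This is not the paper's CTMDP or its linear programming solution: it assumes a single core modeled as an M/M/1 queue, a linear utility function, and a cubic dynamic power model, and every name and parameter below (`Core`, `work_per_request`, `penalty`, `energy_price`, and so on) is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Core:
    # All fields are illustrative assumptions, not values from the paper.
    freqs: tuple             # available DVFS frequency levels (GHz)
    work_per_request: float  # giga-cycles needed per request
    static_power: float      # watts drawn regardless of frequency
    dyn_coeff: float         # watts per GHz^3 (assumed cubic dynamic power)

def mean_response_time(arrival_rate, service_rate):
    """Mean response time of an M/M/1 queue; infinite if the queue is unstable."""
    if service_rate <= arrival_rate:
        return float("inf")
    return 1.0 / (service_rate - arrival_rate)

def profit_rate(core, freq, arrival_rate, base_price, penalty, energy_price):
    """Profit per second: revenue from served requests minus energy cost."""
    service_rate = freq / core.work_per_request
    t = mean_response_time(arrival_rate, service_rate)
    if t == float("inf"):
        return float("-inf")  # an overloaded core is never an acceptable choice
    # Assumed linear utility: the price per request falls as response time grows.
    price_per_request = max(0.0, base_price - penalty * t)
    power = core.static_power + core.dyn_coeff * freq ** 3
    return arrival_rate * price_per_request - energy_price * power

def best_frequency(core, arrival_rate, base_price, penalty, energy_price):
    """Pick the DVFS level with the highest profit rate by exhaustive search."""
    return max(
        core.freqs,
        key=lambda f: profit_rate(core, f, arrival_rate,
                                  base_price, penalty, energy_price),
    )

core = Core(freqs=(1.0, 2.0, 3.0), work_per_request=0.1,
            static_power=10.0, dyn_coeff=5.0)
chosen = best_frequency(core, arrival_rate=8.0, base_price=1.0,
                        penalty=1.0, energy_price=0.01)
```

With these made-up numbers, the middle frequency is selected: the lowest level loses too much revenue to slow responses, while the highest level pays more in energy than it gains in price. That is the response-time versus power tradeoff the abstract refers to, shown here only for one core under a fixed arrival rate.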

