基于paragon的异构数据中心qos感知调度

IF 1.8 4区计算机科学 Q2 COMPUTER SCIENCE, THEORY & METHODS

ACM Transactions on Computer Systems Pub Date : 2013-12-01 DOI:10.1145/2556583

Christina Delimitrou, C. Kozyrakis

{"title":"基于paragon的异构数据中心qos感知调度","authors":"Christina Delimitrou, C. Kozyrakis","doi":"10.1145/2556583","DOIUrl":null,"url":null,"abstract":"Large-scale datacenters (DCs) host tens of thousands of diverse applications each day. However, interference between colocated workloads and the difficulty of matching applications to one of the many hardware platforms available can degrade performance, violating the quality of service (QoS) guarantees that many cloud workloads require. While previous work has identified the impact of heterogeneity and interference, existing solutions are computationally intensive, cannot be applied online, and do not scale beyond a few applications.\n We present Paragon, an online and scalable DC scheduler that is heterogeneity- and interference-aware. Paragon is derived from robust analytical methods, and instead of profiling each application in detail, it leverages information the system already has about applications it has previously seen. It uses collaborative filtering techniques to quickly and accurately classify an unknown incoming workload with respect to heterogeneity and interference in multiple shared resources. It does so by identifying similarities to previously scheduled applications. The classification allows Paragon to greedily schedule applications in a manner that minimizes interference and maximizes server utilization. After the initial application placement, Paragon monitors application behavior and adjusts the scheduling decisions at runtime to avoid performance degradations. Additionally, we design ARQ, a multiclass admission control protocol that constrains application waiting time. ARQ queues applications in separate classes based on the type of resources they need and avoids long queueing delays for easy-to-satisfy workloads in highly-loaded scenarios. Paragon scales to tens of thousands of servers and applications with marginal scheduling overheads in terms of time or state.\n We evaluate Paragon with a wide range of workload scenarios, on both small and large-scale systems, including 1,000 servers on EC2. For a 2,500-workload scenario, Paragon enforces performance guarantees for 91% of applications, while significantly improving utilization. In comparison, heterogeneity-oblivious, interference-oblivious, and least-loaded schedulers only provide similar guarantees for 14%, 11%, and 3% of workloads. The differences are more striking in oversubscribed scenarios where resource efficiency is more critical.","PeriodicalId":50918,"journal":{"name":"ACM Transactions on Computer Systems","volume":"21 1","pages":"12"},"PeriodicalIF":1.8000,"publicationDate":"2013-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"153","resultStr":"{\"title\":\"QoS-Aware scheduling in heterogeneous datacenters with paragon\",\"authors\":\"Christina Delimitrou, C. Kozyrakis\",\"doi\":\"10.1145/2556583\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Large-scale datacenters (DCs) host tens of thousands of diverse applications each day. However, interference between colocated workloads and the difficulty of matching applications to one of the many hardware platforms available can degrade performance, violating the quality of service (QoS) guarantees that many cloud workloads require. While previous work has identified the impact of heterogeneity and interference, existing solutions are computationally intensive, cannot be applied online, and do not scale beyond a few applications.\\n We present Paragon, an online and scalable DC scheduler that is heterogeneity- and interference-aware. Paragon is derived from robust analytical methods, and instead of profiling each application in detail, it leverages information the system already has about applications it has previously seen. It uses collaborative filtering techniques to quickly and accurately classify an unknown incoming workload with respect to heterogeneity and interference in multiple shared resources. It does so by identifying similarities to previously scheduled applications. The classification allows Paragon to greedily schedule applications in a manner that minimizes interference and maximizes server utilization. After the initial application placement, Paragon monitors application behavior and adjusts the scheduling decisions at runtime to avoid performance degradations. Additionally, we design ARQ, a multiclass admission control protocol that constrains application waiting time. ARQ queues applications in separate classes based on the type of resources they need and avoids long queueing delays for easy-to-satisfy workloads in highly-loaded scenarios. Paragon scales to tens of thousands of servers and applications with marginal scheduling overheads in terms of time or state.\\n We evaluate Paragon with a wide range of workload scenarios, on both small and large-scale systems, including 1,000 servers on EC2. For a 2,500-workload scenario, Paragon enforces performance guarantees for 91% of applications, while significantly improving utilization. In comparison, heterogeneity-oblivious, interference-oblivious, and least-loaded schedulers only provide similar guarantees for 14%, 11%, and 3% of workloads. The differences are more striking in oversubscribed scenarios where resource efficiency is more critical.\",\"PeriodicalId\":50918,\"journal\":{\"name\":\"ACM Transactions on Computer Systems\",\"volume\":\"21 1\",\"pages\":\"12\"},\"PeriodicalIF\":1.8000,\"publicationDate\":\"2013-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"153\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACM Transactions on Computer Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1145/2556583\",\"RegionNum\":4,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"COMPUTER SCIENCE, THEORY & METHODS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM Transactions on Computer Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1145/2556583","RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"COMPUTER SCIENCE, THEORY & METHODS","Score":null,"Total":0}

引用次数: 153

摘要

大型数据中心(dc)每天托管数以万计的各种应用程序。然而，托管工作负载之间的干扰以及将应用程序匹配到可用的众多硬件平台之一的困难可能会降低性能，违反许多云工作负载所需的服务质量(QoS)保证。虽然以前的工作已经确定了异构性和干扰的影响，但现有的解决方案是计算密集型的，不能在线应用，并且不能扩展到少数应用程序之外。我们提出了Paragon，一个在线和可扩展的数据中心调度程序，它具有异构和干扰意识。Paragon来源于健壮的分析方法，它不是详细分析每个应用程序，而是利用系统已经拥有的关于它以前看到的应用程序的信息。它采用协同过滤技术，针对多个共享资源的异构性和干扰性，快速准确地对未知传入工作负载进行分类。它通过识别与先前调度的应用程序的相似之处来实现这一点。这种分类允许Paragon以最小化干扰和最大化服务器利用率的方式贪婪地调度应用程序。在初始应用程序放置之后，Paragon监视应用程序行为，并在运行时调整调度决策，以避免性能下降。此外，我们还设计了ARQ协议，这是一个限制申请等待时间的多类准入控制协议。ARQ根据应用程序所需的资源类型在单独的类中排队，并避免长时间排队延迟，以满足高负载场景中易于满足的工作负载。Paragon可以扩展到数以万计的服务器和应用程序，在时间或状态方面的调度开销很小。我们对Paragon进行了广泛的工作负载场景评估，包括小型和大型系统，包括EC2上的1,000台服务器。对于2500个工作负载的场景，Paragon为91%的应用程序提供了性能保证，同时显著提高了利用率。相比之下，异构无关、干扰无关和最小负载调度器仅为14%、11%和3%的工作负载提供类似的保证。在资源效率更为关键的超额订阅场景中，差异更为显著。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

QoS-Aware scheduling in heterogeneous datacenters with paragon

Large-scale datacenters (DCs) host tens of thousands of diverse applications each day. However, interference between colocated workloads and the difficulty of matching applications to one of the many hardware platforms available can degrade performance, violating the quality of service (QoS) guarantees that many cloud workloads require. While previous work has identified the impact of heterogeneity and interference, existing solutions are computationally intensive, cannot be applied online, and do not scale beyond a few applications. We present Paragon, an online and scalable DC scheduler that is heterogeneity- and interference-aware. Paragon is derived from robust analytical methods, and instead of profiling each application in detail, it leverages information the system already has about applications it has previously seen. It uses collaborative filtering techniques to quickly and accurately classify an unknown incoming workload with respect to heterogeneity and interference in multiple shared resources. It does so by identifying similarities to previously scheduled applications. The classification allows Paragon to greedily schedule applications in a manner that minimizes interference and maximizes server utilization. After the initial application placement, Paragon monitors application behavior and adjusts the scheduling decisions at runtime to avoid performance degradations. Additionally, we design ARQ, a multiclass admission control protocol that constrains application waiting time. ARQ queues applications in separate classes based on the type of resources they need and avoids long queueing delays for easy-to-satisfy workloads in highly-loaded scenarios. Paragon scales to tens of thousands of servers and applications with marginal scheduling overheads in terms of time or state. We evaluate Paragon with a wide range of workload scenarios, on both small and large-scale systems, including 1,000 servers on EC2. For a 2,500-workload scenario, Paragon enforces performance guarantees for 91% of applications, while significantly improving utilization. In comparison, heterogeneity-oblivious, interference-oblivious, and least-loaded schedulers only provide similar guarantees for 14%, 11%, and 3% of workloads. The differences are more striking in oversubscribed scenarios where resource efficiency is more critical.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

ACM Transactions on Computer Systems 工程技术-计算机：理论方法

CiteScore

4.00

自引率

0.00%

发文量

审稿时长

1 months

期刊介绍： ACM Transactions on Computer Systems (TOCS) presents research and development results on the design, implementation, analysis, evaluation, and use of computer systems and systems software. The term "computer systems" is interpreted broadly and includes operating systems, systems architecture and hardware, distributed systems, optimizing compilers, and the interaction between systems and computer networks. Articles appearing in TOCS will tend either to present new techniques and concepts, or to report on experiences and experiments with actual systems. Insights useful to system designers, builders, and users will be emphasized. TOCS publishes research and technical papers, both short and long. It includes technical correspondence to permit commentary on technical topics and on previously published papers.