{"title":"pMACH: Power and Migration Aware Container scHeduling","authors":"Sourav Panda, K. Ramakrishnan, L. Bhuyan","doi":"10.1109/ICNP52444.2021.9651911","DOIUrl":null,"url":null,"abstract":"Data center workload fluctuations need periodic, but careful scheduling to minimize power consumption while meeting the task completion time requirements. Existing data center scheduling systems tightly pack containers to save power. However, with the growth of multi-tiered applications, there is a significant need to account for the affinity between application components, to minimize communication overheads and latency. Centralized container scheduling systems using graph partitioning algorithms cause a significant number of task migrations, with associated downtime.We design pMACH, a novel distributed container scheduling scheme for optimizing both power and task completion time in data centers. It minimizes task migrations and packs frequently communicating containers together without overloading servers. pMACH operates at peak energy efficiency, thus reducing energy consumption while also providing greater headroom for unpredictable workload spikes. We also propose in-network monitoring using smartNICs (sNIC) to measure the communications and then perform scheduling in a hierarchical, parallelized framework to achieve high performance and scalability. pMACH is based on incremental partitioning and it leverages the previous scheduling decision to significantly reduce the number of containers moved between servers, avoiding application downtime.Both testbed measurements and large-scale trace-driven simulations show that pMACH saves at least 13.44% more power compared to previous scheduling systems. It speeds task completion, reducing the 95th percentile by a factor of 1.76-2.11 compared to existing container scheduling schemes. 
Compared to other static graph-based approaches, our incremental partitioning technique reduces migrations per epoch by 82%.","PeriodicalId":343813,"journal":{"name":"2021 IEEE 29th International Conference on Network Protocols (ICNP)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE 29th International Conference on Network Protocols (ICNP)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICNP52444.2021.9651911","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2
Abstract
Data center workloads fluctuate, so containers must be rescheduled periodically, and carefully, to minimize power consumption while still meeting task completion time requirements. Existing data center scheduling systems pack containers tightly to save power. However, with the growth of multi-tiered applications, schedulers also need to account for the affinity between application components in order to minimize communication overhead and latency. Centralized container scheduling systems that use graph partitioning algorithms cause a significant number of task migrations, with associated downtime.

We design pMACH, a novel distributed container scheduling scheme that optimizes both power and task completion time in data centers. It minimizes task migrations and packs frequently communicating containers together without overloading servers. pMACH operates at peak energy efficiency, which reduces energy consumption while also providing greater headroom for unpredictable workload spikes. We also propose in-network monitoring using smartNICs (sNICs) to measure communication, and then perform scheduling in a hierarchical, parallelized framework to achieve high performance and scalability. pMACH is based on incremental partitioning: it leverages the previous scheduling decision to significantly reduce the number of containers moved between servers, avoiding application downtime.

Both testbed measurements and large-scale trace-driven simulations show that pMACH saves at least 13.44% more power than previous scheduling systems. It also speeds task completion, reducing the 95th-percentile task completion time by a factor of 1.76 to 2.11 compared with existing container scheduling schemes. Compared to other static graph-based approaches, our incremental partitioning technique reduces migrations per epoch by 82%.
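
To make the idea of reusing the previous scheduling decision concrete, here is a minimal sketch of migration-aware placement. It is not the pMACH algorithm, which the abstract does not specify; it is a simple greedy heuristic written for illustration. The function name `incremental_place`, the per-server `capacity` cap (standing in for a load level chosen for energy efficiency), and the example containers and traffic volumes are all hypothetical.

```python
from collections import defaultdict


def incremental_place(prev, load, capacity, traffic):
    """Greedy, illustrative placement that starts from the previous epoch.

    prev:     dict container -> server chosen in the previous epoch
    load:     dict container -> resource demand in the current epoch
    capacity: dict server -> assumed load cap for that server
    traffic:  dict (container_a, container_b) -> measured traffic volume
    Returns (new placement, number of migrations).
    """
    placement = dict(prev)  # keep every container where it was by default
    used = defaultdict(float)
    for c, s in placement.items():
        used[s] += load[c]

    def affinity(c, s):
        # Total traffic between container c and the containers placed on s.
        total = 0.0
        for (a, b), vol in traffic.items():
            if a == c and placement.get(b) == s:
                total += vol
            elif b == c and placement.get(a) == s:
                total += vol
        return total

    migrations = 0
    for s in list(capacity):
        # Containers move only while their server exceeds its cap.
        while used[s] > capacity[s]:
            residents = [c for c, srv in placement.items() if srv == s]
            if not residents:
                break
            # Evict the container that talks least to its co-located peers.
            victim = min(residents, key=lambda c: (affinity(c, s), c))
            targets = [t for t in capacity
                       if t != s and used[t] + load[victim] <= capacity[t]]
            if not targets:
                break  # nowhere to move it without overloading another server
            # Prefer the server holding the victim's heaviest communicators.
            dest = max(targets, key=lambda t: (affinity(victim, t), used[t]))
            placement[victim] = dest
            used[s] -= load[victim]
            used[dest] += load[victim]
            migrations += 1
    return placement, migrations


# Hypothetical epoch: server s1 is over its cap, s2 has room.
prev = {"web": "s1", "api": "s1", "cache": "s1", "db": "s2"}
load = {"web": 4, "api": 4, "cache": 4, "db": 3}
capacity = {"s1": 10, "s2": 10}
traffic = {("web", "api"): 100, ("api", "db"): 20, ("cache", "db"): 50}

new_placement, moved = incremental_place(prev, load, capacity, traffic)
print(new_placement, moved)
```

In this example only `cache` moves: it exchanges no traffic with the other containers on `s1` and lands next to `db`, its only communication partner. Partitioning the communication graph from scratch each epoch would have no such anchor to the prior placement, which is the migration cost the abstract attributes to static graph-based approaches.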