{"title":"Fast Decentralized Power Capping for Server Clusters","authors":"R. Azimi, Masoud Badiei, Xin Zhan, Na Li, S. Reda","doi":"10.1109/HPCA.2017.49","DOIUrl":null,"url":null,"abstract":"Power capping is a mechanism to ensure that the power consumption of clusters does not exceed the provisioned resources. A fast power capping method allows for a safe over-subscription of the rated power distribution devices, provides equipment protection, and enables large clusters to participate in demand-response programs. However, current methods have a slow response time with a large actuation latency when applied across a large number of servers as they rely on hierarchical management systems. We propose a fast decentralized power capping (DPC) technique that reduces the actuation latency by localizing power management at each server. The DPC method is based on a maximum throughput optimization formulation that takes into account the workloads priorities as well as the capacity of circuit breakers. Therefore, DPC significantly improves the cluster performance compared to alternative heuristics. We implement the proposed decentralized power management scheme on a real computing cluster. Compared to state-of-the-art hierarchical methods, DPC reduces the actuation latency by 72% up to 86% depending on the cluster size. In addition, DPC improves the system throughput performance by 16%, while using only 0.02% of the available network bandwidth. We describe how to minimize the overhead of each local DPC agent to a negligible amount. We also quantify the traffic and fault resilience of our decentralized power capping approach.","PeriodicalId":118950,"journal":{"name":"2017 IEEE International Symposium on High Performance Computer Architecture (HPCA)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"17","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 IEEE International Symposium on High Performance Computer Architecture (HPCA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCA.2017.49","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 17
Abstract
Power capping is a mechanism to ensure that the power consumption of clusters does not exceed the provisioned resources. A fast power capping method allows for a safe over-subscription of the rated power distribution devices, provides equipment protection, and enables large clusters to participate in demand-response programs. However, current methods have a slow response time with a large actuation latency when applied across a large number of servers as they rely on hierarchical management systems. We propose a fast decentralized power capping (DPC) technique that reduces the actuation latency by localizing power management at each server. The DPC method is based on a maximum throughput optimization formulation that takes into account the workloads priorities as well as the capacity of circuit breakers. Therefore, DPC significantly improves the cluster performance compared to alternative heuristics. We implement the proposed decentralized power management scheme on a real computing cluster. Compared to state-of-the-art hierarchical methods, DPC reduces the actuation latency by 72% up to 86% depending on the cluster size. In addition, DPC improves the system throughput performance by 16%, while using only 0.02% of the available network bandwidth. We describe how to minimize the overhead of each local DPC agent to a negligible amount. We also quantify the traffic and fault resilience of our decentralized power capping approach.