InfiniBand集群的功耗感知集体通信算法设计

2010 39th International Conference on Parallel Processing Pub Date : 2010-09-13 DOI:10.1109/ICPP.2010.78

K. Kandalla, E. Mancini, S. Sur, D. Panda

{"title":"InfiniBand集群的功耗感知集体通信算法设计","authors":"K. Kandalla, E. Mancini, S. Sur, D. Panda","doi":"10.1109/ICPP.2010.78","DOIUrl":null,"url":null,"abstract":"Modern supercomputing systems have witnessed a phenomenal growth in the recent history owing to the advent of multi-core architectures and high speed networks. However, the operational and maintenance costs of these systems have also grown rapidly. Several concepts such as Dynamic Voltage and Frequency Scaling (DVFS) and CPU Throttling have been proposed to conserve the power consumed by the compute nodes during idle periods. However, it is necessary to design software stacks in a power-aware manner to minimize the amount of power drawn by the system during the execution of applications. It is also critical to minimize the performance overheads associated with power-aware algorithms, as the benefits of saving power could be lost if the application runs for a longer time. Modern multi-core architectures such as the Intel “Nehalem” allow for DVFS and CPU throttling operations to be performed with little overheads. In this paper, we explore how these features can be leveraged to design algorithms to deliver fine-grained power savings during the communication phases of parallel applications. We also propose a theoretical model to analyze the power consumption characteristics of communication operations. We use microbenchmarks and application benchmarks such as NAS and CPMD to measure the performance of our proposed algorithms and to demonstrate the potential for saving power with 32 and 64 processes. We observe about 8% improvement in the overall energy consumed by these applications with little performance overheads.","PeriodicalId":180554,"journal":{"name":"2010 39th International Conference on Parallel Processing","volume":"17 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"42","resultStr":"{\"title\":\"Designing Power-Aware Collective Communication Algorithms for InfiniBand Clusters\",\"authors\":\"K. Kandalla, E. Mancini, S. Sur, D. Panda\",\"doi\":\"10.1109/ICPP.2010.78\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Modern supercomputing systems have witnessed a phenomenal growth in the recent history owing to the advent of multi-core architectures and high speed networks. However, the operational and maintenance costs of these systems have also grown rapidly. Several concepts such as Dynamic Voltage and Frequency Scaling (DVFS) and CPU Throttling have been proposed to conserve the power consumed by the compute nodes during idle periods. However, it is necessary to design software stacks in a power-aware manner to minimize the amount of power drawn by the system during the execution of applications. It is also critical to minimize the performance overheads associated with power-aware algorithms, as the benefits of saving power could be lost if the application runs for a longer time. Modern multi-core architectures such as the Intel “Nehalem” allow for DVFS and CPU throttling operations to be performed with little overheads. In this paper, we explore how these features can be leveraged to design algorithms to deliver fine-grained power savings during the communication phases of parallel applications. We also propose a theoretical model to analyze the power consumption characteristics of communication operations. We use microbenchmarks and application benchmarks such as NAS and CPMD to measure the performance of our proposed algorithms and to demonstrate the potential for saving power with 32 and 64 processes. We observe about 8% improvement in the overall energy consumed by these applications with little performance overheads.\",\"PeriodicalId\":180554,\"journal\":{\"name\":\"2010 39th International Conference on Parallel Processing\",\"volume\":\"17 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2010-09-13\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"42\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2010 39th International Conference on Parallel Processing\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICPP.2010.78\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 39th International Conference on Parallel Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICPP.2010.78","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 42

摘要

由于多核架构和高速网络的出现，现代超级计算系统在最近的历史中出现了惊人的增长。然而，这些系统的操作和维护成本也在迅速增长。提出了动态电压和频率缩放(DVFS)和CPU节流等概念，以节省计算节点在空闲期间消耗的功率。然而，有必要以功率感知的方式设计软件堆栈，以尽量减少系统在执行应用程序期间消耗的电量。最小化与功耗感知算法相关的性能开销也很关键，因为如果应用程序运行较长时间，可能会失去节省功耗的好处。现代多核架构，如Intel“Nehalem”，允许以很少的开销执行DVFS和CPU节流操作。在本文中，我们将探讨如何利用这些特性来设计算法，从而在并行应用程序的通信阶段提供细粒度的节能。我们还提出了一个理论模型来分析通信操作的功耗特性。我们使用微基准测试和应用程序基准测试(如NAS和CPMD)来测量我们提出的算法的性能，并演示32和64进程的节能潜力。我们观察到，这些应用程序的总能耗提高了8%，而性能开销很小。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Designing Power-Aware Collective Communication Algorithms for InfiniBand Clusters

Modern supercomputing systems have witnessed a phenomenal growth in the recent history owing to the advent of multi-core architectures and high speed networks. However, the operational and maintenance costs of these systems have also grown rapidly. Several concepts such as Dynamic Voltage and Frequency Scaling (DVFS) and CPU Throttling have been proposed to conserve the power consumed by the compute nodes during idle periods. However, it is necessary to design software stacks in a power-aware manner to minimize the amount of power drawn by the system during the execution of applications. It is also critical to minimize the performance overheads associated with power-aware algorithms, as the benefits of saving power could be lost if the application runs for a longer time. Modern multi-core architectures such as the Intel “Nehalem” allow for DVFS and CPU throttling operations to be performed with little overheads. In this paper, we explore how these features can be leveraged to design algorithms to deliver fine-grained power savings during the communication phases of parallel applications. We also propose a theoretical model to analyze the power consumption characteristics of communication operations. We use microbenchmarks and application benchmarks such as NAS and CPMD to measure the performance of our proposed algorithms and to demonstrate the potential for saving power with 32 and 64 processes. We observe about 8% improvement in the overall energy consumed by these applications with little performance overheads.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2010 39th International Conference on Parallel Processing

自引率

0.00%

发文量