Evaluating Power-Monitoring Capabilities on IBM Blue Gene/P and Blue Gene/Q

Kazutomo Yoshii, K. Iskra, Rinku Gupta, P. Beckman, V. Vishwanath, Chenjie Yu, S. Coghlan
2012 IEEE International Conference on Cluster Computing, published 2012-09-24. DOI: 10.1109/CLUSTER.2012.62 (https://doi.org/10.1109/CLUSTER.2012.62)
Citations: 27

Abstract

Power consumption is becoming a critical factor as we continue our quest toward exascale computing. Yet, actual power utilization of a complete system is an insufficiently studied research area. Estimating the power consumption of a large-scale system is a nontrivial task because a large number of components are involved and because power requirements are affected by the (unpredictable) workloads. Clearly needed is a power-monitoring infrastructure that can provide timely and accurate feedback to system developers and application writers so that they can optimize the use of this precious resource. Many existing large-scale installations do feature power-monitoring sensors; however, those are part of environmental- and health-monitoring subsystems and were not designed with application-level power consumption measurements in mind. In this paper, we evaluate the existing power monitoring of IBM Blue Gene systems, with the goal of understanding what capabilities are available and how they fare with respect to spatial and temporal resolution, accuracy, latency, and other characteristics. We find that with a careful choice of dedicated microbenchmarks, we can obtain meaningful power consumption data even on Blue Gene/P, where the interval between available data points is measured in minutes. We next evaluate the monitoring subsystem on Blue Gene/Q and are able to study the power characteristics of its FPU and memory subsystems. We find the monitoring subsystem capable of providing second-scale resolution of power data, conveniently separated between node components, with a latency of seven seconds. This represents a significant improvement in power-monitoring infrastructure, and we hope future systems will enable real-time power measurement in order to better understand application behavior at a finer granularity.