A Scalable Framework for Online Power Modelling of High-Performance Computing Nodes in Production

Federico Pittino, Francesco Beneventi, Andrea Bartolini, L. Benini
{"title":"A Scalable Framework for Online Power Modelling of High-Performance Computing Nodes in Production","authors":"Federico Pittino, Francesco Beneventi, Andrea Bartolini, L. Benini","doi":"10.1109/HPCS.2018.00058","DOIUrl":null,"url":null,"abstract":"Power and thermal design and management are critical components of high performance computing (HPC) systems, due to their cutting-edge position in terms of high power density and large total power consumption. Many HPC power manage¬ment strategies rely on the availability of accurate compact power models, capable of predicting power consumption and tracking its sensitivity to workload parameters and operating points. In this paper we describe a methodology and a framework for training power models derived with two of the best-in-class procedures directly on the online in production nodes and without requiring dedicated training instances. The compact power models are obtained using an online regression-based approach which can track non-stationary workloads and hardware variability. Our experiments on a real-life HPC system demonstrate that the models achieve very high accuracy over all operating modes. We also demonstrate the scalability of our approach and the small amount of resources needed for the online modeling, for both the training and inference phases.","PeriodicalId":308138,"journal":{"name":"2018 International Conference on High Performance Computing & Simulation (HPCS)","volume":"34 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 International Conference on High Performance Computing & Simulation (HPCS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCS.2018.00058","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

Power and thermal design and management are critical components of high performance computing (HPC) systems, due to their cutting-edge position in terms of high power density and large total power consumption. Many HPC power manage¬ment strategies rely on the availability of accurate compact power models, capable of predicting power consumption and tracking its sensitivity to workload parameters and operating points. In this paper we describe a methodology and a framework for training power models derived with two of the best-in-class procedures directly on the online in production nodes and without requiring dedicated training instances. The compact power models are obtained using an online regression-based approach which can track non-stationary workloads and hardware variability. Our experiments on a real-life HPC system demonstrate that the models achieve very high accuracy over all operating modes. We also demonstrate the scalability of our approach and the small amount of resources needed for the online modeling, for both the training and inference phases.
生产中高性能计算节点在线功率建模的可扩展框架
功率和热设计和管理是高性能计算(HPC)系统的关键组成部分,因为它们在高功率密度和大总功耗方面处于领先地位。许多高性能计算电源管理策略依赖于精确的紧凑电源模型,能够预测功耗并跟踪其对工作负载参数和工作点的敏感性。在本文中,我们描述了一种方法和框架,用于直接在在线生产节点上使用两个同类最佳程序派生的训练功率模型,而不需要专门的训练实例。使用基于在线回归的方法获得紧凑功率模型,该方法可以跟踪非平稳工作负载和硬件可变性。我们在实际的高性能计算系统上的实验表明,该模型在所有工作模式下都具有很高的精度。我们还演示了我们的方法的可扩展性,以及在线建模所需的少量资源,无论是在训练阶段还是在推理阶段。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信