Hardware and application aware performance, power and energy models for modern HPC servers with DVFS

Impact factor 3.8 · Zone 3 (Computer Science) · JCR Q1 (Computer Science, Hardware & Architecture)
Georges Da Costa
Sustainable Computing-Informatics & Systems, volume 46, Article 101106. Published 2025-03-08.
DOI: 10.1016/j.suscom.2025.101106
https://www.sciencedirect.com/science/article/pii/S2210537925000265

Abstract

Energy usage and its ecological impact are now major concerns in High Performance Computing (HPC). To optimize supercomputer efficiency, researchers rely on models, as accessing actual platforms is complex and costly. Changing DVFS (Dynamic Voltage and Frequency Scaling) settings is the most studied method, but it affects power, performance and energy in a complex way.
We propose to bridge the gap between the theoretical and the practical approaches. We propose a multi-cluster, multi-application model that accurately describes, from a theoretical point of view, the power and performance of applications subject to DVFS. We show how to use it in a runtime system with minimal overhead, using only a few hardware performance counters and RAPL (Running Average Power Limit).
We validate our models using an extensive dataset obtained from 18 different clusters running 9 benchmarks. We also show how such a model can be used to optimize the energy-to-solution of HPC workloads.
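To make the idea of optimizing energy-to-solution under DVFS concrete, here is a minimal sketch in Python. It does not reproduce the paper's model: it assumes a generic textbook form (static power plus a cubic dynamic term, and a runtime split into a frequency-scaled part and a frequency-independent part). The class `ToyDvfsModel`, the function `best_frequency`, and all parameter values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ToyDvfsModel:
    """Textbook-style DVFS model (NOT the paper's model; parameters are hypothetical)."""
    static_power_w: float     # frequency-independent power, in watts
    dynamic_coeff: float      # dynamic power coefficient, in W / GHz^3
    cpu_work_gcycles: float   # work that scales with frequency, in 1e9 cycles
    memory_time_s: float      # time that does not scale with frequency, in seconds

    def power(self, freq_ghz: float) -> float:
        # Assumed form: P(f) = P_static + c * f^3
        return self.static_power_w + self.dynamic_coeff * freq_ghz ** 3

    def runtime(self, freq_ghz: float) -> float:
        # Assumed form: T(f) = W / f + T_mem
        return self.cpu_work_gcycles / freq_ghz + self.memory_time_s

    def energy(self, freq_ghz: float) -> float:
        # Energy-to-solution: E(f) = P(f) * T(f), in joules
        return self.power(freq_ghz) * self.runtime(freq_ghz)


def best_frequency(model: ToyDvfsModel, freqs_ghz: Iterable[float]) -> float:
    """Return the available frequency with the lowest modeled energy-to-solution."""
    return min(freqs_ghz, key=model.energy)


if __name__ == "__main__":
    # Hypothetical application and server parameters.
    model = ToyDvfsModel(static_power_w=50.0, dynamic_coeff=5.0,
                         cpu_work_gcycles=100.0, memory_time_s=10.0)
    available = [1.0, 1.5, 2.0, 2.5, 3.0]
    for f in available:
        print(f"{f:.1f} GHz: {model.runtime(f):6.1f} s, "
              f"{model.power(f):6.1f} W, {model.energy(f):8.1f} J")
    print("lowest-energy frequency (GHz):", best_frequency(model, available))
```

Under these assumptions neither the lowest nor the highest frequency minimizes energy: low frequencies pay static power for longer, while high frequencies pay the cubic dynamic term. This trade-off is one reason DVFS affects energy in a non-trivial way.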
Source journal
Sustainable Computing-Informatics & Systems (Computer Science, Hardware & Architecture; Computer Science, Information Systems)
CiteScore: 10.70
Self-citation rate: 4.40%
Articles published: 142
Journal description: Sustainable computing is a rapidly expanding research area spanning computer science and engineering, electrical engineering, and other engineering disciplines. The aim of Sustainable Computing: Informatics and Systems (SUSCOM) is to publish research findings related to energy-aware and thermal-aware management of computing resources. Equally important is a spectrum of related research issues, such as applications of computing that can have ecological and societal impacts. SUSCOM publishes original and timely research papers and survey articles on power, energy, temperature, and environment-related topics of current importance to readers. SUSCOM has an editorial board comprising prominent researchers from around the world and selects competitively evaluated, peer-reviewed papers.