International Workshop on Energy Efficient Supercomputing: Latest Papers

Adaptive precision solvers for sparse linear systems
International Workshop on Energy Efficient Supercomputing Pub Date : 2015-11-15 DOI: 10.1145/2834800.2834802
H. Anzt, J. Dongarra, E. S. Quintana‐Ortí
Abstract: We formulate an implementation of a Jacobi iterative solver for sparse linear systems that iterates the distinct components of the solution with different precision in terms of mantissa length. Starting with very low accuracy, and using an inexpensive test, our technique extends the mantissa length for those component updates when and where this is required. Numerical experiments reveal that, for a solver that pursues IEEE double precision accuracy in the solution (i.e., mantissa of 52 binary digits), the precision required to reach convergence for the distinct components can differ significantly during the iteration so that, during most of this process, only a few components may require operating with the full length of the mantissa. Thus, with operations involving a longer mantissa yielding a higher power usage, energy savings can potentially be obtained by using a truncated format. Finally, we introduce a novel metric which quantifies the average mantissa length during the iteration, and exposes the resource savings of the Jacobi solver with adaptive mantissa.
Citations: 11
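The adaptive-mantissa idea described in the abstract above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the "inexpensive test" or the proposed metric, so the stagnation test, the starting length `start_bits`, the increment `step`, and the plain mean used as an average mantissa length are all assumptions made for illustration. Truncation is emulated in software by masking low-order bits of IEEE double values.

```python
import numpy as np

FULL_BITS = 52  # mantissa length of IEEE double precision


def truncate_mantissa(values, bits):
    """Zero all but the leading `bits[i]` mantissa bits of values[i]."""
    values = np.ascontiguousarray(values, dtype=np.float64)
    drop = (FULL_BITS - np.asarray(bits, dtype=np.int64)).astype(np.uint64)
    mask = np.left_shift(np.uint64(0xFFFFFFFFFFFFFFFF), drop)
    return (values.view(np.uint64) & mask).view(np.float64)


def adaptive_jacobi(A, b, tol=1e-10, max_iter=10_000, start_bits=4, step=4):
    """Jacobi iteration with a per-component mantissa length (illustrative only)."""
    A = np.asarray(A, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    d = np.diag(A)
    R = A - np.diag(d)                      # off-diagonal part of A
    n = b.size
    x = np.zeros(n)
    bits = np.full(n, start_bits, dtype=np.int64)
    prev_change = np.full(n, np.inf)
    bits_sum = 0.0
    for it in range(1, max_iter + 1):
        bits_sum += bits.mean()             # mantissa length used in this sweep
        x_new = truncate_mantissa((b - R @ x) / d, bits)
        change = np.abs(x_new - x)
        # Assumed test: a component whose update stopped shrinking has
        # exhausted its current precision and receives a longer mantissa.
        stalled = (change >= prev_change) & (bits < FULL_BITS)
        prev_change = np.where(stalled, np.inf, change)
        bits[stalled] = np.minimum(bits[stalled] + step, FULL_BITS)
        x = x_new
        # Convergence is checked in full precision to keep the sketch simple.
        if np.linalg.norm(b - A @ x) <= tol * np.linalg.norm(b):
            break
    return x, it, bits_sum / it


if __name__ == "__main__":
    n = 50
    A = 4.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # diagonally dominant
    b = np.ones(n)
    x, iterations, avg_bits = adaptive_jacobi(A, b)
    print(iterations, avg_bits)
```

The returned average is only a stand-in for the metric mentioned in the abstract; a value below 52 indicates that part of the iteration ran with shortened mantissas.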
Early experiences with node-level power capping on the Cray XC40 platform
International Workshop on Energy Efficient Supercomputing Pub Date : 2015-11-15 DOI: 10.1145/2834800.2834801
K. Pedretti, Stephen L. Olivier, Kurt B. Ferreira, G. Shipman, W. Shu
Abstract: Power consumption of extreme-scale supercomputers has become a key performance bottleneck. Yet current practices do not leverage power management opportunities, instead running at "maximum power". This is not sustainable. Future systems will need to manage power as a critical resource, directing it to where it has greatest benefit. Power capping is one mechanism for managing power budgets, however its behavior is not well understood. This paper presents an empirical evaluation of several key HPC workloads running under a power cap on a Cray XC40 system, and provides a comparison of this technique with p-state control, demonstrating the performance differences of each. These results show: 1) Maximum performance requires ensuring the cap is not reached; 2) Performance slowdown under a cap can be attributed to cascading delays which result in unsynchronized performance variability across nodes; and 3) Due to lag in reaction time, considerable time is spent operating above the set cap. This work provides a timely and much needed comparison of HPC application performance under a power cap and attempts to enable users and system administrators to understand how to best optimize application performance on power-constrained HPC systems.
Citations: 20
Towards an application-specific thermal energy model of current processors
International Workshop on Energy Efficient Supercomputing Pub Date : 2015-11-15 DOI: 10.1145/2834800.2834805
V. Getov, D. Kerbyson, M. Macduff, A. Hoisie
Abstract: Recent developments of high-end processors recognize temperature monitoring and tuning as one of the main challenges towards achieving higher performance given the growing power and temperature constraints. To address this challenge, one needs both suitable thermal energy abstraction and corresponding instrumentation. Our model is based on application-specific parameters such as power consumption, execution time, and asymptotic temperature as well as hardware-specific parameters such as half time for thermal rise or fall. As observed with our out-of-band instrumentation and monitoring infrastructure, the temperature changes follow a relatively slow capacitor-style charge-discharge process. Therefore, we use the lumped thermal model that initiates an exponential process whenever there is a change in processor's power consumption. Initial experiments with two codes -- Firestarter and Nekbone -- validate our thermal energy model and demonstrate its use for analyzing and potentially improving the application-specific balance between temperature, power, and performance.
Citations: 11
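The capacitor-style behaviour described in the abstract above can be sketched as a first-order exponential approach to an asymptotic temperature, parameterised by a half time. The following Python sketch is an assumption-laden illustration rather than the paper's model: the function names, the use of a single half time for both rise and fall, and all numeric values are hypothetical.

```python
def temperature_after(t, t_start, t_asym, t_half):
    """Temperature after `t` seconds of constant power.

    t_start: temperature when the power level changed
    t_asym:  asymptotic temperature for the new power level
    t_half:  time for the gap to t_asym to shrink by half (assumed constant)
    """
    return t_asym + (t_start - t_asym) * 0.5 ** (t / t_half)


def simulate_phases(phases, t_start, t_half):
    """Temperature at the end of each phase.

    phases: list of (duration_seconds, asymptotic_temperature) pairs; every
    change of power level restarts the exponential process from the
    temperature reached so far.
    """
    temperature = t_start
    trace = []
    for duration, t_asym in phases:
        temperature = temperature_after(duration, temperature, t_asym, t_half)
        trace.append(temperature)
    return trace


if __name__ == "__main__":
    # Hypothetical numbers: a 10 s load phase heading towards 80 degrees C,
    # followed by a 10 s idle phase heading back towards 40 degrees C.
    print(simulate_phases([(10, 80.0), (10, 40.0)], t_start=40.0, t_half=10.0))
```

With a half time equal to the phase length, each phase closes half of the remaining gap to its asymptotic temperature, which is the defining property of the half-time parameterisation.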
Towards the development of hierarchical data motion power cost models
International Workshop on Energy Efficient Supercomputing Pub Date : 2015-11-15 DOI: 10.1145/2834800.2834804
T. Mintz, Oluwatosin O. Alabi
Abstract: Data intensive applications comprise a considerable portion of HPC center workloads. Whether large amounts of data transfer occur before, during or after an application is executed, this cost must be considered. Not just in terms of performance (e.g. time to completion), but also in terms of power consumed to complete these necessary tasks. At the system level, scheduling and resource management tools are capable of recording performance metrics and other constraints, and making performance aware decisions. These tools are a natural choice for making power aware decisions, as well. More specifically, power aware decisions about data transfer costs for the entire application workflow. This research focuses on developing data motion power cost models and integrating these models into a task scheduler framework to enable complete power aware scheduling of an entire HPC workflow. We have taken an incremental approach to developing a hierarchical, system wide power model for data motion that starts with core data motion and will eventually encompass data motion across facilities. In this paper, we discuss our current research which addresses multicore data motion and data motion between nodes.
Citations: 0
Measurement and characterization of Haswell power and energy consumption
International Workshop on Energy Efficient Supercomputing Pub Date : 2015-11-15 DOI: 10.1145/2834800.2834807
Song Huang, M. Lang, S. Pakin, Song Fu
Abstract: The recently introduced Intel Haswell processors implement major changes compared to their predecessors, especially with respect to power management. Haswell processors are used in the new-generation DOE NNSA tri-lab supercomputer, Trinity, hosted at Los Alamos National Laboratory. In this paper we measure and analyze a number of power-based parameters of Haswell that are of great importance for the energy consumption of applications. We study three HPC benchmarks, HPL, STREAM, FIRESTARTER and a hydrodynamics application, CLAMR. They are representative of workloads stressing different components of computers. Our experimental results show that real-time on-board power monitoring causes substantial power use if no optimization is performed; adapting P-states provides a cost-effective way to improve the power-performance of applications; enabling hyperthreading can significantly save energy by up to 96.3% for compute-bound applications; HPC applications should employ differentiated core affinity strategies in order to achieve the maximum power-performance. Moreover, we study the imbalance of sockets on a server in their power and energy use, and then propose approaches to mitigate such imbalance.
Citations: 15
Compute bottlenecks on the new 64-bit ARM
International Workshop on Energy Efficient Supercomputing Pub Date : 2015-11-15 DOI: 10.1145/2834800.2834806
Adam Jundt, Allyson Cauble-Chantrenne, Ananta Tiwari, Joshua Peraza, M. Laurenzano, L. Carrington
Abstract: The trifecta of power, performance and programmability has spurred significant interest in the 64-bit ARMv8 platform. These new systems provide energy efficiency, a traditional CPU programming model, and the potential of high performance when enough cores are thrown at the problem. However, it remains unclear how well the ARM architecture will work as a design point for the High Performance Computing market. In this paper, we characterize and investigate the key architectural factors that impact power and performance on a current ARMv8 offering (X-Gene 1) and Intel's Sandy Bridge processor. Using Principal Component Analysis, multiple linear regression models, and variable importance analysis we conclude that the CPU frontend has the biggest impact on performance on both the X-Gene and Sandy Bridge processors.
Citations: 14
Experimental design and comparative testing of a hybrid-cooled computer cluster
International Workshop on Energy Efficient Supercomputing Pub Date : 2015-11-15 DOI: 10.1145/2834800.2834803
A. Bonnie
Abstract: With water cooling becoming an affordable option both at home and at scale, it is important to consider the possible benefits over air cooling. There are several methods of liquid cooling, notables include: immersion, cold water cooling, and warm water cooling. The total cost of ownership is difficult to determine with these options as each has a different impact on the data center. Considering retrofit, over a new data center, introduces unforeseen variables that make cost analysis a challenge. Besides the added costs of additional infrastructure, and the cost to remove old, the upfront costs could be daunting. Therefore a cost analysis would be a study of its own. This study however hopes to reveal the resulting tradeoffs in temperature, performance, and power usage presented in the case between classical airflow based heat sink mechanisms to water provided directly at the heat sink. Having control over a discrete chiller will provide answers to the CPU temperatures, power usage, and performance at various inlet water temperatures. To water or to air?
Citations: 1
Toward application-specific memory reconfiguration for energy efficiency
International Workshop on Energy Efficient Supercomputing Pub Date : 2013-11-17 DOI: 10.1145/2536430.2536434
Pietro Cicotti, L. Carrington, A. Chien
Abstract: The end of Dennard scaling has made energy-efficiency a critical challenge in the continued increase of computing performance. An important approach to increasing energy-efficiency is hardware customization. In this study we explore the opportunity for energy-efficiency via memory hierarchy customization and present a methodology to identify application-specific energy efficient configurations. Using a workload of 37 diverse benchmarks, we address three key questions: 1) How much energy saving is possible?, 2) How much reconfiguration is required?, and 3) Can we use application characterization to automatically select an energy-optimal memory hierarchy configuration? Our results show that the potential benefit is large -- average reductions close to 70% in memory hierarchy energy with no performance loss. Further, our results show that the number of configurations need not be large; 13 carefully chosen configurations can deliver 93% of this benefit (64% energy reduction), and even coarse-grain reconfigurations of an existing hierarchy can deliver 81% of this benefit (56% energy reduction), suggesting that reconfigurable hierarchies may be practically realizable. Finally, as a first step towards automatic reconfiguration, we explore application characterization via reuse distance as a guide to select the best memory hierarchy configuration; we show that reuse distance can effectively predict the application-specific configuration which will both maintain performance and deliver energy efficiency.
Citations: 4
Unified performance and power modeling of scientific workloads
International Workshop on Energy Efficient Supercomputing Pub Date : 2013-11-17 DOI: 10.1145/2536430.2536435
S. Song, K. Barker, D. Kerbyson
Abstract: It is expected that scientific applications executing on future large-scale HPC must be optimized not only in terms of performance, but also in terms of power consumption. As power and energy become increasingly constrained resources, researchers and developers must have access to tools that will allow for accurate prediction of both performance and power consumption. Reasoning about performance and power consumption in concert will be critical for achieving maximum utilization of limited resources on future HPC systems. To this end, we present a unified performance and power model for the Nek-Bone mini-application developed as part of the DOE's CESAR Exascale Co-Design Center. Our models consider the impact of computation, point-to-point communication, and collective communication individually and quantitatively predict their impact on both performance and energy efficiency. Further, these models are demonstrated to be accurate on currently available HPC system architectures. In this paper, we present our modeling methodology and performance and power models for the Nek-Bone mini-application. We present validation results that indicate the accuracy of these models.
Citations: 43
Leakage energy estimates for HPC applications
International Workshop on Energy Efficient Supercomputing Pub Date : 2013-11-17 DOI: 10.1145/2536430.2536431
Aditya M. Deshpande, J. Draper
Abstract: Large-scale high-performance systems are energy constrained. With thousands of processing cores at their disposal, these machines contain large amounts of on-chip caches. With a trend of decreasing thresholds in transistors, the amount of leakage current and energy losses has increased dramatically. Coupling the two trends, on-chip caches are responsible for a large portion of total leakage energy losses. In this work, we quantify the on-chip leakage energy losses across a wide set of applications. Our scheme profiles applications to measure cache accesses in order to estimate energy consumption across various levels of caches. Our study indicates that the leakage energy is the dominant form of energy dissipation in on-chip caches and may account for up to 80% of total cache energy, and this trend is expected to increase with every new generation of semiconductor process. Our results also suggest that compiler optimizations have a very limited effect on the total energy consumption of the caches and irrespective of the compiler optimizations, the problem of leakage in caches cannot be effectively addressed by software techniques but requires intervention at circuit and architectural levels. The problem of leakage in caches cannot be neglected in attacking the energy barrier to building exascale systems.
Citations: 6