International Workshop on Energy Efficient Supercomputing: Latest Publications

Adaptive precision solvers for sparse linear systems

International Workshop on Energy Efficient Supercomputing Pub Date : 2015-11-15 DOI: 10.1145/2834800.2834802

H. Anzt, J. Dongarra, E. S. Quintana‐Ortí

Citations: 11

Early experiences with node-level power capping on the Cray XC40 platform

International Workshop on Energy Efficient Supercomputing Pub Date : 2015-11-15 DOI: 10.1145/2834800.2834801

K. Pedretti, Stephen L. Olivier, Kurt B. Ferreira, G. Shipman, W. Shu

Abstract: Power consumption of extreme-scale supercomputers has become a key performance bottleneck. Yet current practices do not leverage power management opportunities, instead running at "maximum power". This is not sustainable. Future systems will need to manage power as a critical resource, directing it to where it has the greatest benefit. Power capping is one mechanism for managing power budgets; however, its behavior is not well understood. This paper presents an empirical evaluation of several key HPC workloads running under a power cap on a Cray XC40 system and compares this technique with p-state control, demonstrating the performance differences of each. The results show that: 1) maximum performance requires ensuring the cap is not reached; 2) performance slowdown under a cap can be attributed to cascading delays, which result in unsynchronized performance variability across nodes; and 3) due to lag in reaction time, considerable time is spent operating above the set cap. This work provides a timely and much-needed comparison of HPC application performance under a power cap and aims to help users and system administrators understand how best to optimize application performance on power-constrained HPC systems.

Citations: 20

Towards an application-specific thermal energy model of current processors

International Workshop on Energy Efficient Supercomputing Pub Date : 2015-11-15 DOI: 10.1145/2834800.2834805

V. Getov, D. Kerbyson, M. Macduff, A. Hoisie

Citations: 11

Towards the development of hierarchical data motion power cost models

International Workshop on Energy Efficient Supercomputing Pub Date : 2015-11-15 DOI: 10.1145/2834800.2834804

T. Mintz, Oluwatosin O. Alabi

Abstract: Data-intensive applications comprise a considerable portion of HPC center workloads. Whether large amounts of data transfer occur before, during, or after an application is executed, this cost must be considered, not just in terms of performance (e.g., time to completion) but also in terms of the power consumed to complete these necessary tasks. At the system level, scheduling and resource management tools are capable of recording performance metrics and other constraints and of making performance-aware decisions. These tools are also a natural choice for making power-aware decisions, specifically power-aware decisions about data transfer costs for the entire application workflow. This research focuses on developing data motion power cost models and integrating these models into a task scheduler framework to enable complete power-aware scheduling of an entire HPC workflow. We have taken an incremental approach to developing a hierarchical, system-wide power model for data motion that starts with core data motion and will eventually encompass data motion across facilities. In this paper, we discuss our current research, which addresses multicore data motion and data motion between nodes.

Citations: 0

Measurement and characterization of Haswell power and energy consumption

International Workshop on Energy Efficient Supercomputing Pub Date : 2015-11-15 DOI: 10.1145/2834800.2834807

Song Huang, M. Lang, S. Pakin, Song Fu

Abstract: The recently introduced Intel Haswell processors implement major changes compared to their predecessors, especially with respect to power management. Haswell processors are used in the new-generation DOE NNSA tri-lab supercomputer, Trinity, hosted at Los Alamos National Laboratory. In this paper we measure and analyze a number of power-based parameters of Haswell that are of great importance for the energy consumption of applications. We study three HPC benchmarks (HPL, STREAM, and FIRESTARTER) and a hydrodynamics application, CLAMR. They are representative of workloads stressing different components of computers. Our experimental results show that real-time on-board power monitoring causes substantial power use if no optimization is performed; adapting P-states provides a cost-effective way to improve the power-performance of applications; enabling hyperthreading can reduce energy consumption by up to 96.3% for compute-bound applications; and HPC applications should employ differentiated core affinity strategies in order to achieve the maximum power-performance. Moreover, we study the imbalance among the sockets of a server in their power and energy use and propose approaches to mitigate such imbalance.

Citations: 15

Compute bottlenecks on the new 64-bit ARM

International Workshop on Energy Efficient Supercomputing Pub Date : 2015-11-15 DOI: 10.1145/2834800.2834806

Adam Jundt, Allyson Cauble-Chantrenne, Ananta Tiwari, Joshua Peraza, M. Laurenzano, L. Carrington

Citations: 14

Experimental design and comparative testing of a hybrid-cooled computer cluster

International Workshop on Energy Efficient Supercomputing Pub Date : 2015-11-15 DOI: 10.1145/2834800.2834803

A. Bonnie

Citations: 1

Toward application-specific memory reconfiguration for energy efficiency

International Workshop on Energy Efficient Supercomputing Pub Date : 2013-11-17 DOI: 10.1145/2536430.2536434

Pietro Cicotti, L. Carrington, A. Chien

Abstract: The end of Dennard scaling has made energy efficiency a critical challenge in the continued increase of computing performance. An important approach to increasing energy efficiency is hardware customization. In this study we explore the opportunity for energy efficiency via memory hierarchy customization and present a methodology to identify application-specific energy-efficient configurations. Using a workload of 37 diverse benchmarks, we address three key questions: 1) How much energy saving is possible? 2) How much reconfiguration is required? 3) Can we use application characterization to automatically select an energy-optimal memory hierarchy configuration? Our results show that the potential benefit is large: average reductions close to 70% in memory hierarchy energy with no performance loss. Further, our results show that the number of configurations need not be large; 13 carefully chosen configurations can deliver 93% of this benefit (64% energy reduction), and even coarse-grain reconfigurations of an existing hierarchy can deliver 81% of this benefit (56% energy reduction), suggesting that reconfigurable hierarchies may be practically realizable. Finally, as a first step towards automatic reconfiguration, we explore application characterization via reuse distance as a guide to select the best memory hierarchy configuration; we show that reuse distance can effectively predict the application-specific configuration that will both maintain performance and deliver energy efficiency.
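The abstract above relies on reuse distance without defining it. As a rough illustration only, the sketch below computes the textbook quantity: for each memory access, the number of distinct addresses touched since the previous access to the same address. It is not the authors' tooling; the function names `reuse_distances` and `lru_hits` and the toy trace are hypothetical, and the fully associative LRU cache is a simplifying assumption.

```python
from collections import OrderedDict


def reuse_distances(trace):
    """Return the reuse distance of each access in `trace`.

    The reuse distance of an access is the number of distinct addresses
    touched since the previous access to the same address. A first-time
    access gets None (conceptually an infinite distance, a cold miss).
    """
    stack = OrderedDict()  # addresses ordered from least to most recently used
    distances = []
    for addr in trace:
        if addr in stack:
            keys = list(stack)
            # Count the distinct addresses used more recently than `addr`.
            distances.append(len(keys) - 1 - keys.index(addr))
            stack.move_to_end(addr)
        else:
            distances.append(None)
            stack[addr] = True
    return distances


def lru_hits(trace, capacity):
    """Count hits in a hypothetical fully associative LRU cache of `capacity` lines.

    An access hits exactly when its reuse distance is smaller than the capacity.
    """
    return sum(1 for d in reuse_distances(trace) if d is not None and d < capacity)


if __name__ == "__main__":
    toy_trace = ["a", "b", "c", "a", "b", "b"]  # illustrative trace, not real data
    print(reuse_distances(toy_trace))
    for capacity in (1, 2, 3):
        print(capacity, lru_hits(toy_trace, capacity))
```

Because a single pass over the trace yields the hit count for every cache capacity, a profile like this can in principle be compared against several candidate cache sizes without re-simulating each one, which is one reason reuse distance is a plausible guide for choosing a configuration.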

Citations: 4

Unified performance and power modeling of scientific workloads

International Workshop on Energy Efficient Supercomputing Pub Date : 2013-11-17 DOI: 10.1145/2536430.2536435

S. Song, K. Barker, D. Kerbyson

Abstract: It is expected that scientific applications executing on future large-scale HPC systems must be optimized not only in terms of performance but also in terms of power consumption. As power and energy become increasingly constrained resources, researchers and developers must have access to tools that will allow for accurate prediction of both performance and power consumption. Reasoning about performance and power consumption in concert will be critical for achieving maximum utilization of limited resources on future HPC systems. To this end, we present a unified performance and power model for the Nek-Bone mini-application developed as part of the DOE's CESAR Exascale Co-Design Center. Our models consider the impact of computation, point-to-point communication, and collective communication individually and quantitatively predict their impact on both performance and energy efficiency. Further, these models are demonstrated to be accurate on currently available HPC system architectures. In this paper, we present our modeling methodology and our performance and power models for the Nek-Bone mini-application, along with validation results that indicate the accuracy of these models.

Citations: 43

Leakage energy estimates for HPC applications

International Workshop on Energy Efficient Supercomputing Pub Date : 2013-11-17 DOI: 10.1145/2536430.2536431

Aditya M. Deshpande, J. Draper

Abstract: Large-scale high-performance systems are energy constrained. With thousands of processing cores at their disposal, these machines contain large amounts of on-chip cache. With the trend of decreasing thresholds in transistors, the amount of leakage current and energy loss has increased dramatically. Coupling the two trends, on-chip caches are responsible for a large portion of total leakage energy losses. In this work, we quantify the on-chip leakage energy losses across a wide set of applications. Our scheme profiles applications to measure cache accesses in order to estimate energy consumption across various levels of cache. Our study indicates that leakage energy is the dominant form of energy dissipation in on-chip caches and may account for up to 80% of total cache energy, and this trend is expected to increase with every new generation of semiconductor process. Our results also suggest that compiler optimizations have a very limited effect on the total energy consumption of the caches. Irrespective of compiler optimizations, the problem of leakage in caches cannot be effectively addressed by software techniques but requires intervention at the circuit and architectural levels. The problem of leakage in caches cannot be neglected in attacking the energy barrier to building exascale systems.

Citations: 6