Jeffrey S. Young, M. G. Lopez, Mitchel D. Horton, Richard Glassbrook, J. Vetter
Title: Advanced Application Support for Improved GPU Utilization on Keeneland
DOI: 10.1145/2616498.2616506 (https://doi.org/10.1145/2616498.2616506)
Venue: Proceedings of XSEDE16 : Diversity, Big Data, and Science at Scale : July 17-21, 2016, Intercontinental Miami Hotel, Miami, Florida, USA. Conference on Extreme Science and Engineering Discovery Environment (5th : 2016 : Miami, Fla.), pages 6:1-6:6
Publication date: 2014-07-13
Abstract
With the delivery of the Keeneland Full Scale (KFS) system in 2012, XSEDE gained a new, unique GPU computing resource that contains a large number of GPUs per node. In KFS, each node has three NVIDIA Fermi GPUs, for a total of 792 GPUs and a theoretical peak of 614.5 TFLOPS across 264 nodes. While this system provides the potential for extreme productivity, its architecture also requires each user to make full use of all the GPU resources on each allocated node to achieve the best performance. Previous work [12] demonstrated a tool that tracks the GPU utilization of individual nodes and of the system as a whole; it has helped to pinpoint low GPU utilization on KFS and its precursor, KIDS.
This work discusses experiences, strategies, and results that have been applied on the Keeneland Full Scale system to ensure that users are fully utilizing GPU resources and to improve the performance of their calculations while reducing Service Unit (SU) usage. In many cases, these strategies boil down to two factors: user education and code optimization for KFS's unique architecture. Three specific applications are discussed in this context from the molecular science, materials science, and chemistry domains, and recent application support results are used to illustrate how small interventions can greatly increase utilization on a month-to-month basis.
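The per-node arithmetic behind the abstract can be made concrete with a minimal sketch. The node and GPU counts come from the abstract; the function names are hypothetical and are not taken from the paper or from the utilization tool cited as [12]. The sketch also assumes that a job holds whole nodes for its full duration, so any GPU it does not drive sits idle; that is an illustrative assumption, not a statement of the actual Keeneland charging policy.

```python
# Figures stated in the abstract.
GPUS_PER_NODE = 3
NODES = 264


def total_gpus(nodes=NODES, gpus_per_node=GPUS_PER_NODE):
    """Total GPU count of the system: nodes times GPUs per node."""
    return nodes * gpus_per_node


def node_gpu_utilization(gpus_used, gpus_per_node=GPUS_PER_NODE):
    """Fraction of a node's GPUs that a job actually drives (hypothetical metric)."""
    if not 0 <= gpus_used <= gpus_per_node:
        raise ValueError("gpus_used must be between 0 and gpus_per_node")
    return gpus_used / gpus_per_node


def idle_gpu_hours(nodes_allocated, hours, gpus_used_per_node,
                   gpus_per_node=GPUS_PER_NODE):
    """GPU-hours left unused, assuming whole nodes are held for the full run."""
    unused_per_node = gpus_per_node - gpus_used_per_node
    return nodes_allocated * hours * unused_per_node


if __name__ == "__main__":
    print(total_gpus())              # 264 nodes x 3 GPUs per node
    print(node_gpu_utilization(1))   # a one-GPU-per-node job
    print(idle_gpu_hours(4, 10, 1))  # 4 nodes, 10 hours, 1 GPU used per node
```

Under these assumptions, a code that drives only one GPU per node leaves two thirds of each allocated node's GPUs idle, which is the kind of pattern the abstract says user education and code optimization are meant to correct.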