Adaptive Optimization for Petascale Heterogeneous CPU/GPU Computing

Canqun Yang, Feng Wang, Yunfei Du, Juan Chen, Jie Liu, Huizhan Yi, Kai Lu
{"title":"Adaptive Optimization for Petascale Heterogeneous CPU/GPU Computing","authors":"Canqun Yang, Feng Wang, Yunfei Du, Juan Chen, Jie Liu, Huizhan Yi, Kai Lu","doi":"10.1109/CLUSTER.2010.12","DOIUrl":null,"url":null,"abstract":"In this paper, we describe our experiment developing an implementation of the Linpack benchmark for TianHe-1, a petascale CPU/GPU supercomputer system, the largest GPU-accelerated system ever attempted before. An adaptive optimization framework is presented to balance the workload distribution across the GPUs and CPUs with the negligible runtime overhead, resulting in the better performance than the static or the training partitioning methods. The CPU-GPU communication overhead is effectively hidden by a software pipelining technique, which is particularly useful for large memory-bound applications. Combined with other traditional optimizations, the Linpack we optimized using the adaptive optimization framework achieved 196.7 GFLOPS on a single compute element of TianHe-1. This result is 70.1% of the peak compute capability and 3.3 times faster than the result using the vendor’s library. On the full configuration of TianHe-1 our optimizations resulted in a Linpack performance of 0.563PFLOPS, which made TianHe-1 the 5th fastest supercomputer on the Top500 list released in November 2009.","PeriodicalId":152171,"journal":{"name":"2010 IEEE International Conference on Cluster Computing","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-09-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"83","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 IEEE International Conference on Cluster Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/CLUSTER.2010.12","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 83

Abstract

In this paper, we describe our experiment developing an implementation of the Linpack benchmark for TianHe-1, a petascale CPU/GPU supercomputer system, the largest GPU-accelerated system ever attempted before. An adaptive optimization framework is presented to balance the workload distribution across the GPUs and CPUs with the negligible runtime overhead, resulting in the better performance than the static or the training partitioning methods. The CPU-GPU communication overhead is effectively hidden by a software pipelining technique, which is particularly useful for large memory-bound applications. Combined with other traditional optimizations, the Linpack we optimized using the adaptive optimization framework achieved 196.7 GFLOPS on a single compute element of TianHe-1. This result is 70.1% of the peak compute capability and 3.3 times faster than the result using the vendor’s library. On the full configuration of TianHe-1 our optimizations resulted in a Linpack performance of 0.563PFLOPS, which made TianHe-1 the 5th fastest supercomputer on the Top500 list released in November 2009.
千万亿异构CPU/GPU计算的自适应优化
在本文中,我们描述了我们在天河一号上开发Linpack基准测试的实验,天河一号是一个千万亿次CPU/GPU超级计算机系统,是迄今为止尝试过的最大的GPU加速系统。提出了一种自适应优化框架,在运行时开销可忽略的情况下平衡gpu和cpu之间的工作负载分布,从而获得比静态或训练分区方法更好的性能。CPU-GPU通信开销通过软件流水线技术有效地隐藏,这对于内存受限的大型应用程序特别有用。结合其他传统优化,我们使用自适应优化框架优化的Linpack在天河一号的单个计算单元上实现了196.7 GFLOPS。这个结果是峰值计算能力的70.1%,比使用供应商库的结果快3.3倍。在天河一号的完整配置下,我们的优化结果使Linpack性能达到0.563PFLOPS,这使得天河一号在2009年11月发布的世界500强中排名第五。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信