Simulating HEP Workflows on Heterogeneous Architectures

C. Leggett, I. Shapoval
{"title":"Simulating HEP Workflows on Heterogeneous Architectures","authors":"C. Leggett, I. Shapoval","doi":"10.1109/eScience.2018.00087","DOIUrl":null,"url":null,"abstract":"The next generation of supercomputing facilities, such as Oak Ridge's Summit and Lawrence Livermore's Sierra, show an increasing use of GPGPUs and other accelerators in order to achieve their high FLOP counts. This trend will only grow with exascale facilities. In general, High Energy Physics computing workflows have made little use of GPUs due to the relatively small fraction of kernels that run efficiently on GPUs, and the expense of rewriting code for rapidly evolving GPU hardware. However, the computing requirements for high-luminosity LHC are enormous, and it will become essential to be able to make use of supercomputing facilities that rely heavily on GPUs and other accelerator technologies. ATLAS has already developed an extension to AthenaMT, its multithreaded event processing framework, that enables the non-intrusive offloading of computations to external accelerator resources, and is developing strategies to schedule the offloading efficiently. Before investing heavily in writing many kernels, we need to better understand the performance metrics and throughput bounds of the workflows with various accelerator configurations. This can be done by simulating the workflows, using real metrics for task interdependencies and timing, as we vary fractions of offloaded tasks, latencies, data conversion speeds, memory bandwidths, and accelerator offloading parameters such as CPU/GPU ratios and speeds. We present the results of these studies, which will be instrumental in directing effort to make the ATLAS framework, kernels and workflows run efficiently on exascale facilities.","PeriodicalId":6476,"journal":{"name":"2018 IEEE 14th International Conference on e-Science (e-Science)","volume":"6 1","pages":"343-343"},"PeriodicalIF":0.0000,"publicationDate":"2018-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE 14th International Conference on e-Science (e-Science)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/eScience.2018.00087","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3

Abstract

The next generation of supercomputing facilities, such as Oak Ridge's Summit and Lawrence Livermore's Sierra, show an increasing use of GPGPUs and other accelerators in order to achieve their high FLOP counts. This trend will only grow with exascale facilities. In general, High Energy Physics computing workflows have made little use of GPUs due to the relatively small fraction of kernels that run efficiently on GPUs, and the expense of rewriting code for rapidly evolving GPU hardware. However, the computing requirements for high-luminosity LHC are enormous, and it will become essential to be able to make use of supercomputing facilities that rely heavily on GPUs and other accelerator technologies. ATLAS has already developed an extension to AthenaMT, its multithreaded event processing framework, that enables the non-intrusive offloading of computations to external accelerator resources, and is developing strategies to schedule the offloading efficiently. Before investing heavily in writing many kernels, we need to better understand the performance metrics and throughput bounds of the workflows with various accelerator configurations. This can be done by simulating the workflows, using real metrics for task interdependencies and timing, as we vary fractions of offloaded tasks, latencies, data conversion speeds, memory bandwidths, and accelerator offloading parameters such as CPU/GPU ratios and speeds. We present the results of these studies, which will be instrumental in directing effort to make the ATLAS framework, kernels and workflows run efficiently on exascale facilities.
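The simulation approach outlined above (sweeping offload fractions, latencies, data-conversion costs, and CPU/GPU ratios over a task graph with fixed timing metrics) can be illustrated with a minimal throughput model. The sketch below is an assumption-laden illustration only, not the authors' simulator nor the AthenaMT offloading extension; all task names, timings, speedups, and the `event_time`/`throughput` helpers are hypothetical placeholders.

```python
"""Minimal sketch (not the authors' simulator) of a parametric throughput
model for a HEP event-processing chain with partial GPU offloading.
All task names, timings, and parameters are illustrative assumptions."""

from dataclasses import dataclass


@dataclass
class Task:
    name: str
    cpu_time: float      # seconds per event on one CPU core (assumed)
    gpu_speedup: float   # kernel speedup when offloaded (assumed)
    offloadable: bool    # whether a GPU kernel exists for this task


def event_time(tasks, offload_fraction, offload_latency, conversion_time):
    """Wall time to process one event in a single CPU slot.

    offload_fraction: fraction of offloadable work actually sent to the GPU.
    offload_latency:  per-offload round-trip latency (s), e.g. PCIe + scheduling.
    conversion_time:  per-offload data (de)serialisation cost (s).
    """
    total = 0.0
    for t in tasks:
        if t.offloadable:
            gpu_part = t.cpu_time * offload_fraction / t.gpu_speedup
            cpu_part = t.cpu_time * (1.0 - offload_fraction)
            overhead = (offload_latency + conversion_time) if offload_fraction > 0 else 0.0
            total += cpu_part + gpu_part + overhead
        else:
            total += t.cpu_time
    return total


def throughput(tasks, n_cpu_slots, n_gpus, **kw):
    """Events/s, crudely capped by either CPU slots or GPU capacity."""
    per_event = event_time(tasks, **kw)
    cpu_bound = n_cpu_slots / per_event
    # GPU-seconds of offloaded work demanded per event.
    gpu_demand = sum(t.cpu_time * kw["offload_fraction"] / t.gpu_speedup
                     for t in tasks if t.offloadable)
    gpu_bound = float("inf") if gpu_demand == 0 else n_gpus / gpu_demand
    return min(cpu_bound, gpu_bound)


if __name__ == "__main__":
    # Illustrative reconstruction-like chain; numbers are placeholders.
    chain = [
        Task("tracking",    cpu_time=2.0, gpu_speedup=8.0, offloadable=True),
        Task("calorimetry", cpu_time=1.0, gpu_speedup=5.0, offloadable=True),
        Task("combined",    cpu_time=1.5, gpu_speedup=1.0, offloadable=False),
    ]
    for frac in (0.0, 0.5, 1.0):
        rate = throughput(chain, n_cpu_slots=64, n_gpus=4,
                          offload_fraction=frac,
                          offload_latency=0.01, conversion_time=0.02)
        print(f"offload_fraction={frac:.1f}: {rate:.2f} events/s")
```

Sweeping `offload_fraction`, the latency and conversion parameters, and the slot-to-GPU ratio in such a model reproduces the kind of parameter scan described in the abstract, identifying where throughput becomes GPU-bound rather than CPU-bound.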