Assembly micro-benchmark generator for characterizing Floating Point Units

2019 International Conference on High Performance Computing & Simulation (HPCS). Pub Date: 2019-07-01. DOI: 10.1109/HPCS48598.2019.9188209

Jean Pourroy, P. Demichel, C. Denis



Abstract

Choosing the right platform has always been a challenge for HPC users, whatever application vertical they work in. The number of available product references is very large, and a wrong choice can have adverse effects. Formerly, users only had to choose between, for example, different processor and interconnect vendors. Lately, with the new Intel Skylake processors, the choice has become increasingly difficult because different levels of performance are available within a single vendor's platforms. To facilitate selection and suggest directions for the real applications being benchmarked, we introduce the Kernel Generator, an open-source tool that generates assembly kernels to help the programmer or benchmarker understand the behavior of different micro-architectures. We used the tool to study the behavior of current micro-architectures and compared the results with current synthetic benchmarks, which sometimes fail to characterize a platform correctly or to expose its strengths. The Kernel Generator makes it easier to discover which platform fits a given performance need. To ensure the relevance of our kernels, we examine the behavior of Ansys Fluent to explain its performance on different Intel processors. In this case, we find that the Intel 4100 and 6100 processor families can deliver equivalent performance on codes that are not well vectorized, Fluent being one of them. This demonstrates that the tool can be used for initial profiling and understanding of different platforms.
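To make the idea of generating assembly kernels concrete, here is a minimal sketch of what such a generator might look like. It is not the Kernel Generator described in the paper, whose interface is not given in this abstract. The function names (`generate_fma_kernel`, `flops_per_iteration`), the symbol `fpu_kernel`, the choice of the `vfmadd231pd` instruction, and the GNU assembler Intel-syntax output are all assumptions made for illustration.

```python
# Hypothetical sketch: NOT the paper's Kernel Generator. All names are illustrative.
# It emits the text of an x86-64 loop made of independent FMA instructions, so
# that varying the vector width or the number of chains stresses the FPUs differently.

REGISTER_PREFIX = {128: "xmm", 256: "ymm", 512: "zmm"}


def generate_fma_kernel(width_bits=256, independent_chains=8, name="fpu_kernel"):
    """Return assembly text (assumed GNU as, Intel syntax) for an FMA loop.

    Assumption: the loop count arrives in rdi (System V AMD64 calling
    convention) and is at least 1.
    """
    if width_bits not in REGISTER_PREFIX:
        raise ValueError("width_bits must be 128, 256 or 512")
    # Registers 14 and 15 serve as shared sources, leaving 0..13 as accumulators.
    if not 1 <= independent_chains <= 14:
        raise ValueError("independent_chains must be between 1 and 14")

    reg = REGISTER_PREFIX[width_bits]
    lines = [
        "    .intel_syntax noprefix",
        "    .text",
        f"    .globl {name}",
        f"{name}:",
        ".Lloop:",
    ]
    # Each accumulator forms its own dependency chain, so the instructions
    # within one iteration do not depend on each other.
    for i in range(independent_chains):
        lines.append(f"    vfmadd231pd {reg}{i}, {reg}14, {reg}15")
    lines += ["    dec rdi", "    jnz .Lloop", "    ret"]
    return "\n".join(lines) + "\n"


def flops_per_iteration(width_bits, independent_chains):
    """Double-precision FLOPs per loop iteration (one FMA = 2 FLOPs per lane)."""
    lanes = width_bits // 64
    return 2 * lanes * independent_chains


if __name__ == "__main__":
    print(generate_fma_kernel(width_bits=512, independent_chains=4))
```

Timing the assembled loop and dividing `flops_per_iteration` times the iteration count by the elapsed time would give an achieved FLOP rate for one width and chain count. Sweeping both parameters is one plausible way to compare how different processors behave on vectorized versus poorly vectorized code.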
