{"title":"Cactus: Top-Down GPU-Compute Benchmarking using Real-Life Applications","authors":"Mahmood Naderan-Tahan, L. Eeckhout","doi":"10.1109/IISWC53511.2021.00026","DOIUrl":null,"url":null,"abstract":"Benchmarking is the de facto standard for evaluating hardware architectures in academia and industry. While several benchmark suites targeting different application domains have been developed for CPU processors over many decades, benchmarking GPU architectures is not as mature. Since the introduction of GPUs for general-purpose computing, the purpose has been to accelerate (a) specific part(s) of the code, called (a) kernel(s). The initial GPU-compute benchmark suites, which are still widely used today, hence consist of relatively simple workloads that are composed of one or few kernels with specific unambiguous execution characteristics. In contrast, we find that modern-day real-life GPU-compute applications are much more complex consisting of many more kernels with differing characteristics. A fundamental question can hence be raised: are current benchmark suites still representative for modern real-life applications? In this paper, we introduce Cactus, a collection of widely used real-life open-source GPU-compute applications. The aim of this work is to offer a new perspective on GPU-compute benchmarking: while existing benchmark suites are designed in a bottom-up fashion (i.e., starting from kernels that are likely to perform well on GPUs), we perform GPU-compute benchmarking in a top-down fashion, starting from complex real-life applications that are composed of multiple kernels. We characterize the Cactus benchmarks by quantifying their kernel execution time distribution, by analyzing the workloads using the roofline model, by performing a performance metrics correlation analysis, and by classifying their constituent kernels through multi-dimensional data analysis. The overall conclusion is that the Cactus workloads execute many more kernels, include more diverse and more complex execution behavior, and cover a broader range of the workload space compared to the prevalently used benchmark suites. We hence believe that Cactus is a useful complement to the existing GPU-compute benchmarking toolbox.","PeriodicalId":203713,"journal":{"name":"2021 IEEE International Symposium on Workload Characterization (IISWC)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2021-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE International Symposium on Workload Characterization (IISWC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IISWC53511.2021.00026","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 2
Abstract
Benchmarking is the de facto standard for evaluating hardware architectures in academia and industry. While several benchmark suites targeting different application domains have been developed for CPU processors over many decades, benchmarking GPU architectures is not as mature. Since the introduction of GPUs for general-purpose computing, the purpose has been to accelerate one or more specific parts of the code, called kernels. The initial GPU-compute benchmark suites, which are still widely used today, hence consist of relatively simple workloads composed of one or a few kernels with specific, unambiguous execution characteristics. In contrast, we find that modern-day real-life GPU-compute applications are much more complex, consisting of many more kernels with differing characteristics. A fundamental question hence arises: are current benchmark suites still representative of modern real-life applications? In this paper, we introduce Cactus, a collection of widely used real-life open-source GPU-compute applications. The aim of this work is to offer a new perspective on GPU-compute benchmarking: while existing benchmark suites are designed in a bottom-up fashion (i.e., starting from kernels that are likely to perform well on GPUs), we perform GPU-compute benchmarking in a top-down fashion, starting from complex real-life applications that are composed of multiple kernels. We characterize the Cactus benchmarks by quantifying their kernel execution time distribution, by analyzing the workloads using the roofline model, by performing a performance metrics correlation analysis, and by classifying their constituent kernels through multi-dimensional data analysis. The overall conclusion is that the Cactus workloads execute many more kernels, exhibit more diverse and more complex execution behavior, and cover a broader range of the workload space than the prevalently used benchmark suites. We hence believe that Cactus is a useful complement to the existing GPU-compute benchmarking toolbox.
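As background on the roofline model mentioned in the characterization methodology, here is a minimal sketch in Python. The peak compute and bandwidth figures are illustrative placeholders, not values taken from the paper; substitute the datasheet numbers for the GPU under study.

```python
# Minimal roofline-model sketch: a kernel's attainable performance is
# bounded by either peak compute or peak memory bandwidth, depending
# on its arithmetic intensity (FLOP per byte moved to/from DRAM).

def roofline_bound(arithmetic_intensity, peak_flops, peak_bandwidth):
    """Attainable performance (FLOP/s) at a given arithmetic intensity."""
    return min(peak_flops, arithmetic_intensity * peak_bandwidth)

# Hypothetical GPU: 14 TFLOP/s peak compute, 900 GB/s peak bandwidth.
PEAK_FLOPS = 14e12   # FLOP/s
PEAK_BW = 900e9      # bytes/s
ridge_point = PEAK_FLOPS / PEAK_BW  # intensity where the bound flattens

for ai in (0.5, 4, 64):
    bound = roofline_bound(ai, PEAK_FLOPS, PEAK_BW)
    regime = "memory-bound" if ai < ridge_point else "compute-bound"
    print(f"AI={ai:>5} FLOP/B -> {bound / 1e12:.2f} TFLOP/s ({regime})")
```

Likewise, a kernel execution time distribution of the kind the paper quantifies can be sketched by aggregating per-kernel durations from a profiler trace. The file name and column names below are hypothetical, assuming a trace exported as CSV with one row per kernel launch:

```python
import csv
from collections import defaultdict

# Aggregate total time per kernel from a (hypothetical) profiler export
# with columns "kernel_name" and "duration_ns".
totals = defaultdict(float)
with open("kernel_trace.csv") as f:
    for row in csv.DictReader(f):
        totals[row["kernel_name"]] += float(row["duration_ns"])

# Print each kernel's share of total GPU time, largest first.
grand_total = sum(totals.values())
for name, t in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{t / grand_total:6.2%}  {name}")
```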