Microbenchmarks for GPU Characteristics: The Occupancy Roofline and the Pipeline Model
J. Lemeire, Jan G. Cornelis, Laurent Segers
2016 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing (PDP), 2016-02-01
DOI: 10.1109/PDP.2016.120 (https://doi.org/10.1109/PDP.2016.120)
Citations: 8
Abstract
In this paper we present OpenCL microbenchmarks that measure the most important performance characteristics of GPUs. Each microbenchmark tries to isolate a single characteristic that influences performance. First, performance, in operations or bytes per second, is measured as a function of occupancy, which yields an occupancy roofline curve. The curve shows the occupancy level at which peak performance is reached. Second, considering the cycles per instruction of each compute unit, we measure the two most important characteristics of an instruction: its issue latency and its completion latency. This is based on modeling each compute unit as one pipeline for computations and one pipeline for memory access. We also measure some specific characteristics: the influence of independent instructions within a kernel and of thread divergence. We argue that these are the most important characteristics for understanding and predicting performance. Results for several Nvidia and AMD GPUs are provided. A free Java application containing the microbenchmarks is available at www.gpuperformance.org.
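The link between the two latencies and the occupancy roofline can be illustrated with a simplified latency-hiding model. The sketch below is an assumption made for illustration, not the formulation used in the paper: it treats a compute unit as a single pipeline that can issue one instruction every `issue_latency` cycles, while each warp must wait `completion_latency` cycles before issuing its next dependent instruction. The function names and the numeric values in the usage note are hypothetical.

```python
import math


def cycles_per_instruction(issue_latency, completion_latency, concurrent_warps):
    """Average cycles per instruction under a simplified pipeline model.

    Assumption: every warp runs a chain of dependent instructions, so a warp
    issues once per completion_latency cycles, and the pipeline accepts at
    most one instruction per issue_latency cycles.
    """
    if concurrent_warps < 1:
        raise ValueError("concurrent_warps must be at least 1")
    if issue_latency <= 0 or completion_latency < issue_latency:
        raise ValueError("expected 0 < issue_latency <= completion_latency")
    # With few warps the completion latency dominates; with enough warps the
    # pipeline is saturated and the issue latency becomes the limit.
    return max(issue_latency, completion_latency / concurrent_warps)


def saturation_occupancy(issue_latency, completion_latency):
    """Smallest number of concurrent warps that hides the completion latency."""
    return math.ceil(completion_latency / issue_latency)


def occupancy_roofline(issue_latency, completion_latency, max_warps):
    """Throughput (instructions per cycle) for 1..max_warps concurrent warps."""
    return [
        (n, 1.0 / cycles_per_instruction(issue_latency, completion_latency, n))
        for n in range(1, max_warps + 1)
    ]
```

With hypothetical values such as an issue latency of 4 cycles and a completion latency of 24 cycles, `occupancy_roofline(4, 24, 12)` rises linearly up to 6 concurrent warps and stays flat beyond that point, which is the roofline shape the abstract describes. Real devices have several pipelines and instruction-level parallelism within a warp, so measured curves need not follow this idealized form.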