{"title":"Linear Performance-Breakdown Model: A Framework for GPU kernel programs performance analysis","authors":"Mario Alberto Chapa Martell, Hiroyuki Sato","doi":"10.15803/IJNC.5.1_86","DOIUrl":null,"url":null,"abstract":"In this paper we describe our performance-breakdown model for GPU programs. GPUs are a popular choice as accelerator hardware due to their high performance, high availability and relatively low price. However, writing programs that are highly efficient represents a difficult and time consuming task for programmers because of the complexities of GPU architecture and the inherent difficulty of parallel programming. That is the reason why we propose the Linear Performance-Breakdown Model Framework as a tool to assist in the optimization process. We show that the model closely matches the behavior of the GPU by comparing the execution time obtained from experiments in two different types of GPU, an Accelerated Processing Unit (APU) and a GTX660, a discrete board. We also show performance-breakdown results obtained from applying the modeling strategy and how they indicate the time spent during the computation in each of the three Mayor Performance Factors that we define as processing time, global memory transfer time and shared memory transfer time.Â","PeriodicalId":270166,"journal":{"name":"Int. J. Netw. Comput.","volume":"21 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"3","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Int. J. Netw. Comput.","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.15803/IJNC.5.1_86","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 3
Abstract
In this paper we describe our performance-breakdown model for GPU programs. GPUs are a popular choice as accelerator hardware due to their high performance, high availability and relatively low price. However, writing programs that are highly efficient represents a difficult and time consuming task for programmers because of the complexities of GPU architecture and the inherent difficulty of parallel programming. That is the reason why we propose the Linear Performance-Breakdown Model Framework as a tool to assist in the optimization process. We show that the model closely matches the behavior of the GPU by comparing the execution time obtained from experiments in two different types of GPU, an Accelerated Processing Unit (APU) and a GTX660, a discrete board. We also show performance-breakdown results obtained from applying the modeling strategy and how they indicate the time spent during the computation in each of the three Mayor Performance Factors that we define as processing time, global memory transfer time and shared memory transfer time.Â