{"title":"Effects of source-code optimizations on GPU performance and energy consumption","authors":"Jared Coplin, Martin Burtscher","doi":"10.1145/2716282.2716292","DOIUrl":null,"url":null,"abstract":"This paper studies the effects of source-code optimizations on the performance, power draw, and energy consumption of a modern compute GPU. We evaluate 128 versions of two n-body codes: a compute-bound regular implementation and a memory-bound irregular implementation. Both programs include six optimizations that can be individually enabled or disabled. We measured the active runtime and the power consumption of each code version on three inputs, various GPU clock frequencies, two arithmetic precisions, and with and without ECC. This paper investigates which optimizations primarily improve energy efficiency, which ones mainly boost performance, and which ones help both aspects. Some optimizations also have the added benefit of reducing the power draw. Our analysis shows that individual and combinations of optimizations can alter the performance and energy consumption of a GPU kernel by up to a factor of five.","PeriodicalId":432610,"journal":{"name":"Proceedings of the 8th Workshop on General Purpose Processing using GPUs","volume":"62 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-02-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 8th Workshop on General Purpose Processing using GPUs","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2716282.2716292","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 16
Abstract
This paper studies the effects of source-code optimizations on the performance, power draw, and energy consumption of a modern compute GPU. We evaluate 128 versions of two n-body codes: a compute-bound regular implementation and a memory-bound irregular implementation. Both programs include six optimizations that can be individually enabled or disabled. We measured the active runtime and the power consumption of each code version on three inputs, various GPU clock frequencies, two arithmetic precisions, and with and without ECC. This paper investigates which optimizations primarily improve energy efficiency, which ones mainly boost performance, and which ones help both aspects. Some optimizations also have the added benefit of reducing the power draw. Our analysis shows that individual and combinations of optimizations can alter the performance and energy consumption of a GPU kernel by up to a factor of five.