{"title":"Generating High Performance GPU Code using Rewrite Rules with Lift","authors":"Christophe Dubach","doi":"10.1145/3180270.3182628","DOIUrl":null,"url":null,"abstract":"Graphic processors (GPUs) are the cornerstone of modern heterogeneous systems. GPUs exhibit tremendous computational power but are notoriously hard to program. High-level programming languages and domainspecific languages have been proposed to address this issue. However, they often rely on complex analysis in the compiler or device-specific implementations to achieve maximum performance. This means that compilers and software implementations need to be re-written and re-tuned continuously as new hardware emerge. In this talk, I will present Lift, a novel high-level data-parallel programming model. The language is based on a surprisingly small set of functional primitives which can be combined to define higher-level hardwareagnostic algorithmic patterns. A system of rewrite-rules is used to derive device-specific optimised low-level implementations of the algorithmic patterns. The rules encode both algorithmic choices and low-level optimisations in a unified system and let the compiler explore the optimisation space automatically. Our results show that the generated code matches the performance of highly tuned implementations of several computational kernels from linear algebra and stencil domain across various classes of GPUs.","PeriodicalId":274320,"journal":{"name":"Proceedings of the 11th Workshop on General Purpose GPUs","volume":"14 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-02-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 11th Workshop on General Purpose GPUs","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3180270.3182628","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Graphic processors (GPUs) are the cornerstone of modern heterogeneous systems. GPUs exhibit tremendous computational power but are notoriously hard to program. High-level programming languages and domainspecific languages have been proposed to address this issue. However, they often rely on complex analysis in the compiler or device-specific implementations to achieve maximum performance. This means that compilers and software implementations need to be re-written and re-tuned continuously as new hardware emerge. In this talk, I will present Lift, a novel high-level data-parallel programming model. The language is based on a surprisingly small set of functional primitives which can be combined to define higher-level hardwareagnostic algorithmic patterns. A system of rewrite-rules is used to derive device-specific optimised low-level implementations of the algorithmic patterns. The rules encode both algorithmic choices and low-level optimisations in a unified system and let the compiler explore the optimisation space automatically. Our results show that the generated code matches the performance of highly tuned implementations of several computational kernels from linear algebra and stencil domain across various classes of GPUs.