用于硬件加速器生产性编程的外编译

Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation Pub Date : 2022-06-09 DOI:10.1145/3519939.3523446

Yuka Ikarashi, G. Bernstein, Alex Reinking, Hasan Genç, Jonathan Ragan-Kelley

{"title":"用于硬件加速器生产性编程的外编译","authors":"Yuka Ikarashi, G. Bernstein, Alex Reinking, Hasan Genç, Jonathan Ragan-Kelley","doi":"10.1145/3519939.3523446","DOIUrl":null,"url":null,"abstract":"High-performance kernel libraries are critical to exploiting accelerators and specialized instructions in many applications. Because compilers are difficult to extend to support diverse and rapidly-evolving hardware targets, and automatic optimization is often insufficient to guarantee state-of-the-art performance, these libraries are commonly still coded and optimized by hand, at great expense, in low-level C and assembly. To better support development of high-performance libraries for specialized hardware, we propose a new programming language, Exo, based on the principle of exocompilation: externalizing target-specific code generation support and optimization policies to user-level code. Exo allows custom hardware instructions, specialized memories, and accelerator configuration state to be defined in user libraries. It builds on the idea of user scheduling to externalize hardware mapping and optimization decisions. Schedules are defined as composable rewrites within the language, and we develop a set of effect analyses which guarantee program equivalence and memory safety through these transformations. We show that Exo enables rapid development of state-of-the-art matrix-matrix multiply and convolutional neural network kernels, for both an embedded neural accelerator and x86 with AVX-512 extensions, in a few dozen lines of code each.","PeriodicalId":140942,"journal":{"name":"Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation","volume":"13 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-06-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":"{\"title\":\"Exocompilation for productive programming of hardware accelerators\",\"authors\":\"Yuka Ikarashi, G. Bernstein, Alex Reinking, Hasan Genç, Jonathan Ragan-Kelley\",\"doi\":\"10.1145/3519939.3523446\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"High-performance kernel libraries are critical to exploiting accelerators and specialized instructions in many applications. Because compilers are difficult to extend to support diverse and rapidly-evolving hardware targets, and automatic optimization is often insufficient to guarantee state-of-the-art performance, these libraries are commonly still coded and optimized by hand, at great expense, in low-level C and assembly. To better support development of high-performance libraries for specialized hardware, we propose a new programming language, Exo, based on the principle of exocompilation: externalizing target-specific code generation support and optimization policies to user-level code. Exo allows custom hardware instructions, specialized memories, and accelerator configuration state to be defined in user libraries. It builds on the idea of user scheduling to externalize hardware mapping and optimization decisions. Schedules are defined as composable rewrites within the language, and we develop a set of effect analyses which guarantee program equivalence and memory safety through these transformations. We show that Exo enables rapid development of state-of-the-art matrix-matrix multiply and convolutional neural network kernels, for both an embedded neural accelerator and x86 with AVX-512 extensions, in a few dozen lines of code each.\",\"PeriodicalId\":140942,\"journal\":{\"name\":\"Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation\",\"volume\":\"13 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-06-09\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"13\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3519939.3523446\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3519939.3523446","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 13

摘要

高性能内核库对于在许多应用程序中利用加速器和专用指令至关重要。因为编译器很难扩展以支持各种快速发展的硬件目标，而且自动优化通常不足以保证最先进的性能，所以这些库通常仍然是手工编码和优化的，花费很大，使用低级C语言和汇编语言。为了更好地支持针对特定硬件的高性能库的开发，我们提出了一种基于外编译原理的新编程语言Exo:将特定目标的代码生成支持和优化策略外部化到用户级代码。Exo允许在用户库中定义自定义硬件指令、专用内存和加速器配置状态。它建立在用户调度的思想之上，将硬件映射和优化决策具体化。时间表被定义为语言中的可组合重写，我们开发了一套效果分析，通过这些转换保证程序等价和内存安全。我们展示了Exo能够快速开发最先进的矩阵-矩阵乘法和卷积神经网络内核，用于嵌入式神经加速器和带有AVX-512扩展的x86，每个只需要几十行代码。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Exocompilation for productive programming of hardware accelerators

High-performance kernel libraries are critical to exploiting accelerators and specialized instructions in many applications. Because compilers are difficult to extend to support diverse and rapidly-evolving hardware targets, and automatic optimization is often insufficient to guarantee state-of-the-art performance, these libraries are commonly still coded and optimized by hand, at great expense, in low-level C and assembly. To better support development of high-performance libraries for specialized hardware, we propose a new programming language, Exo, based on the principle of exocompilation: externalizing target-specific code generation support and optimization policies to user-level code. Exo allows custom hardware instructions, specialized memories, and accelerator configuration state to be defined in user libraries. It builds on the idea of user scheduling to externalize hardware mapping and optimization decisions. Schedules are defined as composable rewrites within the language, and we develop a set of effect analyses which guarantee program equivalence and memory safety through these transformations. We show that Exo enables rapid development of state-of-the-art matrix-matrix multiply and convolutional neural network kernels, for both an embedded neural accelerator and x86 with AVX-512 extensions, in a few dozen lines of code each.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

Proceedings of the 43rd ACM SIGPLAN International Conference on Programming Language Design and Implementation

自引率

0.00%

发文量