Stack size estimation on machine-independent intermediate code for OpenCL kernels

PARMA-DITAM '16. Publication date: 2016-01-18. DOI: 10.1145/2872421.2872425

Stefano Cherubin, M. Scandale, G. Agosta


Citations: 1

Abstract

Stack size estimation on machine-independent intermediate code for OpenCL kernels

Stack size is an important factor in the mapping decision when dealing with embedded heterogeneous architectures, where fast memory is a scarce resource. Trying to map a kernel onto a device with insufficient memory may lead to reduced performance or even failure to run the kernel. OpenCL kernels are often compiled just-in-time, starting from the source code or an intermediate machine-independent representation. Precise stack size information, however, is only available in machine-dependent code. We provide a method for computing the stack size with sufficient accuracy on machine-independent code, given knowledge of the target ABI and register file architecture. This method can be applied to make mapping decisions early, thus avoiding the need to compile the code multiple times, once for each possible accelerator, in a complex embedded heterogeneous system.
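The abstract does not describe the estimation procedure itself, so the following is only an illustrative sketch of the general idea and not the authors' method. It assumes a toy straight-line intermediate representation and a hypothetical target description (`Target`, `Instr`, and all field names are invented for this example). The estimate adds the explicit stack allocations to a spill estimate derived from the peak number of live values that exceed the register file, then rounds up to the alignment the ABI requires.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Target:
    """Hypothetical target description; not taken from the paper."""
    num_registers: int    # registers available to the allocator
    register_bytes: int   # size of one spill slot
    stack_alignment: int  # frame alignment required by the ABI, in bytes


@dataclass(frozen=True)
class Instr:
    """One instruction of a toy straight-line IR (an assumption of this sketch)."""
    dest: Optional[str]         # value defined here, or None (e.g. a store)
    uses: Tuple[str, ...] = ()  # values read here
    alloca_bytes: int = 0       # non-zero for an explicit stack allocation


def align_up(value: int, alignment: int) -> int:
    """Round value up to the next multiple of alignment."""
    return (value + alignment - 1) // alignment * alignment


def max_live_values(code) -> int:
    """Peak size of the live-in set, found by a backward liveness walk."""
    live = set()
    peak = 0
    for ins in reversed(code):
        live.discard(ins.dest)  # a value is not live before its definition
        live.update(ins.uses)   # operands must be live on entry
        peak = max(peak, len(live))
    return peak


def estimate_stack_bytes(code, target: Target) -> int:
    """Rough frame size: explicit allocations plus estimated spill slots."""
    locals_bytes = sum(ins.alloca_bytes for ins in code)
    spilled = max(0, max_live_values(code) - target.num_registers)
    raw = locals_bytes + spilled * target.register_bytes
    return align_up(raw, target.stack_alignment)


# A tiny kernel body: one 24-byte private buffer and a few temporaries.
kernel = [
    Instr("buf", alloca_bytes=24),
    Instr("a"),
    Instr("b"),
    Instr("c", ("a", "b")),
    Instr("d", ("a", "c")),
    Instr("e", ("b", "d")),
    Instr(None, ("buf", "e")),
]

small = Target(num_registers=2, register_bytes=8, stack_alignment=16)
large = Target(num_registers=8, register_bytes=8, stack_alignment=16)

print(estimate_stack_bytes(kernel, small))  # 48: two values spill
print(estimate_stack_bytes(kernel, large))  # 32: everything fits in registers
```

A runtime could compare such an estimate with each device's fast-memory budget before committing to a full compilation. A realistic estimator would also have to handle control flow, calls, per-allocation alignment, and argument passing, all of which this sketch ignores.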
