Forma: a DSL for image processing applications to target GPUs and multi-core CPUs

Mahesh Ravishankar, Justin Holewinski, Vinod Grover
Proceedings of the 8th Workshop on General Purpose Processing using GPUs
Published: 2015-02-07
DOI: 10.1145/2716282.2716290 (https://doi.org/10.1145/2716282.2716290)
Citations: 27

Abstract

As architectures evolve, optimization techniques to obtain good performance evolve as well. Using low-level programming languages like C/C++ typically results in architecture-specific optimization techniques getting entangled with the application specification. In such situations, moving from one target architecture to another usually requires a reimplementation of the entire application. Further, several compiler transformations are rendered ineffective due to implementation choices. Domain-Specific Languages (DSLs) tackle both these issues by allowing developers to specify the computation at a high level, allowing the compiler to handle many tedious and error-prone tasks, while generating efficient code for multiple target architectures at the same time. Here we present Forma, a DSL for image processing applications that targets both CPUs and GPUs. The language provides syntax to express several operations, such as stencils and sampling, which are commonly used in this domain. These can be chained together to specify complex pipelines in a concise manner. The Forma compiler is in charge of tedious tasks like memory management, data transfers from host to device, and handling boundary conditions. The high-level description allows the compiler to generate efficient code through use of compile-time analysis and by taking advantage of hardware resources, like texture memory on GPUs. The ease with which complex pipelines can be specified in Forma is demonstrated through several examples. The efficiency of the generated code is evaluated through comparison with Halide, a state-of-the-art DSL that targets the same domain. Our experimental results show that using Forma allows developers to obtain comparable performance on both CPU and GPU with less programmer effort. We also show how Forma could be easily integrated with widely used productivity tools like Python and OpenCV. Such an integration would allow users of these tools to develop efficient implementations easily.
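
To make the abstract's vocabulary concrete, the following is a minimal sketch in plain Python of the kinds of operations it mentions: a stencil, a sampling step, and boundary handling, chained into a small pipeline. This is not Forma syntax and not the compiler's generated code. The function names, the 3x3 box blur, the factor-of-two downsampling, and the clamp-to-edge boundary rule are all assumptions chosen for illustration.

```python
def clamp(i, lo, hi):
    """Clamp index i into [lo, hi]; used here as an assumed boundary rule."""
    return max(lo, min(i, hi))


def box_blur_3x3(img):
    """Stencil: each output pixel is the mean of its 3x3 neighborhood.

    Out-of-range neighbors are read from the nearest edge pixel.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            total = 0.0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = clamp(y + dy, 0, h - 1)
                    xx = clamp(x + dx, 0, w - 1)
                    total += img[yy][xx]
            out[y][x] = total / 9.0
    return out


def downsample_2x(img):
    """Sampling: keep every second pixel in each dimension."""
    return [row[::2] for row in img[::2]]


def pipeline(img):
    """Chain the two stages: blur first, then downsample."""
    return downsample_2x(box_blur_3x3(img))


if __name__ == "__main__":
    image = [[float(x + y) for x in range(4)] for y in range(4)]
    for row in pipeline(image):
        print(row)
```

In hand-written code like this, the loop structure, the boundary rule, and the intermediate buffer are all fixed by the programmer. These are the kinds of decisions the abstract says a DSL compiler can take over.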