Divergence Analysis with Affine Constraints

Diogo Sampaio, R. M. Souza, Caroline Collange, Fernando Magno Quintão Pereira
Published in: 2012 IEEE 24th International Symposium on Computer Architecture and High Performance Computing
Publication date: 2012-10-24
DOI: https://doi.org/10.1109/SBAC-PAD.2012.22
Citations: 15

Abstract

The rising popularity of graphics processing units (GPUs) has renewed interest in code optimization techniques for SIMD processors. Many of these optimizations rely on divergence analyses, which classify a variable as uniform if it has the same value on every thread, or divergent if it might not. This paper introduces a new kind of divergence analysis that can represent variables as affine functions of thread identifiers. We have implemented this analysis in Ocelot, an open-source compiler, and used it to analyze a suite of 177 CUDA kernels from well-known benchmarks. The analysis marks about one fourth of all program variables as affine functions of thread identifiers. In addition to the novel divergence analysis, we introduce the notion of a divergence-aware register allocator. This allocator uses information from our analysis either to rematerialize affine variables or to move uniform variables to shared memory. As evidence of its effectiveness, our divergence-aware allocator produces GPU code that is 29.70% faster than the code produced by Ocelot's register allocator. Divergence analysis with affine constraints has been publicly available in the Ocelot compiler since June 2012.
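To make the idea of "variables as affine functions of thread identifiers" concrete, the following is a minimal sketch of such an abstract domain. It is not the algorithm implemented in Ocelot: the names `Affine`, `TID`, `const`, `uniform_unknown`, `add`, and `mul` are hypothetical, and the sketch assumes a single thread identifier `tid`, straight-line code, and only addition and multiplication. Each variable is abstracted as `slope * tid + offset`, where `None` marks a coefficient that is not a known compile-time constant.

```python
from dataclasses import dataclass
from typing import Optional

# None stands for "not a known compile-time constant" (an assumption of this sketch).
Coef = Optional[int]


@dataclass(frozen=True)
class Affine:
    """Abstract value: on every thread, value == slope * tid + offset.

    offset is None -> the base is unknown but identical on all threads.
    slope is None  -> no affine relation to tid is known (divergent).
    """
    slope: Coef
    offset: Coef

    def kind(self) -> str:
        if self.slope is None:
            return "divergent"
        return "uniform" if self.slope == 0 else "affine"


TID = Affine(1, 0)               # the thread identifier itself
DIVERGENT = Affine(None, None)   # nothing is known about the variable


def const(c: int) -> Affine:
    """A compile-time constant: same known value on every thread."""
    return Affine(0, c)


def uniform_unknown() -> Affine:
    """A thread-invariant value unknown at compile time, e.g. a kernel argument."""
    return Affine(0, None)


def _add(x: Coef, y: Coef) -> Coef:
    return None if x is None or y is None else x + y


def _scale(c: int, x: Coef) -> Coef:
    return None if x is None else c * x


def add(a: Affine, b: Affine) -> Affine:
    # (s1*tid + o1) + (s2*tid + o2) = (s1 + s2)*tid + (o1 + o2)
    return Affine(_add(a.slope, b.slope), _add(a.offset, b.offset))


def mul(a: Affine, b: Affine) -> Affine:
    # The product of two uniform values is uniform.
    if a.slope == 0 and b.slope == 0:
        return Affine(0, None if None in (a.offset, b.offset) else a.offset * b.offset)
    # Scaling by a known constant c keeps the value affine: c*(s*tid + o).
    for c, v in ((a, b), (b, a)):
        if c.slope == 0 and c.offset is not None:
            return Affine(_scale(c.offset, v.slope), _scale(c.offset, v.offset))
    # Anything else (e.g. tid * tid) is conservatively treated as divergent.
    return DIVERGENT


# i = tid * 4 + base, where base is a thread-invariant kernel argument
i = add(mul(TID, const(4)), uniform_unknown())
print(i.kind())              # affine
print(mul(TID, TID).kind())  # divergent
```

Under these assumptions, a variable whose slope is a known constant is the kind the abstract describes as a candidate for rematerialization (it can be recomputed from `tid` and a shared base), while a variable with slope 0 is uniform and is a candidate for placement in shared memory.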