A compiler framework for extracting superword level parallelism

Jun Liu, Yuanrui Zhang, Ohyoung Jang, W. Ding, M. Kandemir
{"title":"A compiler framework for extracting superword level parallelism","authors":"Jun Liu, Yuanrui Zhang, Ohyoung Jang, W. Ding, M. Kandemir","doi":"10.1145/2254064.2254106","DOIUrl":null,"url":null,"abstract":"SIMD (single-instruction multiple-data) instruction set extensions are quite common today in both high performance and embedded microprocessors, and enable the exploitation of a specific type of data parallelism called SLP (Superword Level Parallelism). While prior research shows that significant performance savings are possible when SLP is exploited, placing SIMD instructions in an application code manually can be very difficult and error prone. In this paper, we propose a novel automated compiler framework for improving superword level parallelism exploitation. The key part of our framework consists of two stages: superword statement generation and data layout optimization. The first stage is our main contribution and has two phases, statement grouping and statement scheduling, of which the primary goals are to increase SIMD parallelism and, more importantly, capture more superword reuses among the superword statements through global data access and reuse pattern analysis. Further, as a complementary optimization, our data layout optimization organizes data in memory space such that the price of memory operations for SLP is minimized. The results from our compiler implementation and tests on two systems indicate performance improvements as high as 15.2% over a state-of-the-art SLP optimization algorithm.","PeriodicalId":308121,"journal":{"name":"Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation","volume":"122 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-06-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"48","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 33rd ACM SIGPLAN Conference on Programming Language Design and Implementation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2254064.2254106","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 48

Abstract

SIMD (single-instruction multiple-data) instruction set extensions are quite common today in both high performance and embedded microprocessors, and enable the exploitation of a specific type of data parallelism called SLP (Superword Level Parallelism). While prior research shows that significant performance savings are possible when SLP is exploited, placing SIMD instructions in an application code manually can be very difficult and error prone. In this paper, we propose a novel automated compiler framework for improving superword level parallelism exploitation. The key part of our framework consists of two stages: superword statement generation and data layout optimization. The first stage is our main contribution and has two phases, statement grouping and statement scheduling, of which the primary goals are to increase SIMD parallelism and, more importantly, capture more superword reuses among the superword statements through global data access and reuse pattern analysis. Further, as a complementary optimization, our data layout optimization organizes data in memory space such that the price of memory operations for SLP is minimized. The results from our compiler implementation and tests on two systems indicate performance improvements as high as 15.2% over a state-of-the-art SLP optimization algorithm.
一个用于提取超词级并行性的编译器框架
SIMD(单指令多数据)指令集扩展如今在高性能微处理器和嵌入式微处理器中都很常见,并且支持利用一种称为SLP (Superword Level parallelism)的特定类型的数据并行性。虽然先前的研究表明,利用SLP可以显著节省性能,但在应用程序代码中手动放置SIMD指令可能非常困难,而且容易出错。在本文中,我们提出了一个新的自动编译框架,以提高超词级并行性的利用。我们的框架的关键部分包括两个阶段:超级词语句生成和数据布局优化。第一个阶段是我们的主要贡献,它有两个阶段:语句分组和语句调度,其中的主要目标是增加SIMD并行性,更重要的是,通过全局数据访问和重用模式分析在超词语句之间捕获更多的超词重用。此外,作为补充优化,我们的数据布局优化在内存空间中组织数据,使SLP的内存操作成本最小化。我们在两个系统上的编译器实现和测试结果表明,与最先进的SLP优化算法相比,性能提高高达15.2%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信