Revisiting loop fusion in the polyhedral framework

Sanyam Mehta, P. Lin, P. Yew
{"title":"Revisiting loop fusion in the polyhedral framework","authors":"Sanyam Mehta, P. Lin, P. Yew","doi":"10.1145/2555243.2555250","DOIUrl":null,"url":null,"abstract":"Loop fusion is an important compiler optimization for improving memory hierarchy performance through enabling data reuse. Traditional compilers have approached loop fusion in a manner decoupled from other high-level loop optimizations, missing several interesting solutions. Recently, the polyhedral compiler framework with its ability to compose complex transformations, has proved to be promising in performing loop optimizations for small programs. However, our experiments with large programs using state-of-the-art polyhedral compiler frameworks reveal suboptimal fusion partitions in the transformed code. We trace the reason for this to be lack of an effective cost model to choose a good fusion partitioning among the possible choices, which increase exponentially with the number of program statements. In this paper, we propose a fusion algorithm to choose good fusion partitions with two objective functions - achieving good data reuse and preserving parallelism inherent in the source code. These objectives, although targeted by previous work in traditional compilers, pose new challenges within the polyhedral compiler framework and have thus not been addressed. In our algorithm, we propose several heuristics that work effectively within the polyhedral compiler framework and allow us to achieve the proposed objectives. Experimental results show that our fusion algorithm achieves performance comparable to the existing polyhedral compilers for small kernel programs, and significantly outperforms them for large benchmark programs such as those in the SPEC benchmark suite.","PeriodicalId":286119,"journal":{"name":"ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming","volume":"15 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"27","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACM SIGPLAN Symposium on Principles & Practice of Parallel Programming","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2555243.2555250","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 27

Abstract

Loop fusion is an important compiler optimization for improving memory hierarchy performance through enabling data reuse. Traditional compilers have approached loop fusion in a manner decoupled from other high-level loop optimizations, missing several interesting solutions. Recently, the polyhedral compiler framework with its ability to compose complex transformations, has proved to be promising in performing loop optimizations for small programs. However, our experiments with large programs using state-of-the-art polyhedral compiler frameworks reveal suboptimal fusion partitions in the transformed code. We trace the reason for this to be lack of an effective cost model to choose a good fusion partitioning among the possible choices, which increase exponentially with the number of program statements. In this paper, we propose a fusion algorithm to choose good fusion partitions with two objective functions - achieving good data reuse and preserving parallelism inherent in the source code. These objectives, although targeted by previous work in traditional compilers, pose new challenges within the polyhedral compiler framework and have thus not been addressed. In our algorithm, we propose several heuristics that work effectively within the polyhedral compiler framework and allow us to achieve the proposed objectives. Experimental results show that our fusion algorithm achieves performance comparable to the existing polyhedral compilers for small kernel programs, and significantly outperforms them for large benchmark programs such as those in the SPEC benchmark suite.
重述多面体框架中的环融合
循环融合是一项重要的编译器优化,通过启用数据重用来提高内存层次结构性能。传统的编译器以一种与其他高级循环优化解耦的方式处理循环融合,错过了几个有趣的解决方案。最近,具有组合复杂转换能力的多面体编译器框架已被证明在执行小程序的循环优化方面很有前途。然而,我们使用最先进的多面体编译器框架对大型程序进行的实验揭示了转换代码中的次优融合分区。我们认为其原因是缺乏一个有效的成本模型来在可能的选择中选择一个好的融合划分,而这些选择随着程序语句的数量呈指数级增长。本文提出了一种基于两个目标函数选择好的融合分区的融合算法——实现良好的数据重用和保持源代码固有的并行性。这些目标虽然是传统编译器以前工作的目标,但在多面体编译器框架内提出了新的挑战,因此尚未得到解决。在我们的算法中,我们提出了几种在多面体编译器框架内有效工作的启发式方法,并允许我们实现所提出的目标。实验结果表明,我们的融合算法在小型内核程序上的性能与现有的多面体编译器相当,在大型基准程序(如SPEC基准套件)上的性能明显优于现有的多面体编译器。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 求助全文
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信