FFTX and SpectralPack: A First Look

2018 IEEE 25th International Conference on High Performance Computing Workshops (HiPCW). Pub Date: 2018-12-01. DOI: 10.1109/HIPCW.2018.8634111

F. Franchetti, Daniele G. Spampinato, Anuva Kulkarni, Doru-Thom Popovici, Tze Meng Low, M. Franusich, A. Canning, P. McCorquodale, B. V. Straalen, P. Colella


Citations: 18

Abstract

We propose FFTX, a new framework for building high-performance FFT-based applications on exascale machines. Complex node architectures lead to multiple levels of parallelism and demand efficient data communication. The current FFTW interface falls short of maximizing performance in such scenarios. FFTX is designed to let application developers leverage expert-level, automatic optimizations while working with a familiar interface. FFTX is backward compatible with FFTW and extends the FFTW interface into an embedded domain-specific language (DSL) expressed as a library interface. By means of a SPIRAL-based back end, this enables build-time source-to-source translation and advanced performance optimizations, such as optimization across library calls, targeting of accelerators through offloading, and inlining of user-provided kernels. We demonstrate the use of FFTX with the prototypical example of 1D and 3D pruned convolutions and discuss future extensions.
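To make the "pruned convolution" example concrete, the following is a minimal sketch in plain Python of what a 1D pruned convolution computes. It is not the FFTX or FFTW API. It assumes that "pruned" means the input occupies only part of a zero-padded array and that only a sub-range of the output is kept. The function names `fft` and `pruned_convolution` and the parameters `n`, `out_start`, and `out_len` are hypothetical and chosen for illustration only.

```python
import cmath


def fft(x, inverse=False):
    """Unnormalized recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor applied to the odd-indexed half.
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def pruned_convolution(signal, kernel, n, out_start, out_len):
    """Cyclic convolution of size n (assumed to be a power of two).

    The inputs are zero-padded to length n (input pruning), and only
    out[out_start : out_start + out_len] is returned (output pruning).
    """
    padded_signal = list(signal) + [0.0] * (n - len(signal))
    padded_kernel = list(kernel) + [0.0] * (n - len(kernel))
    fs = fft(padded_signal)
    fk = fft(padded_kernel)
    # Pointwise multiplication in the frequency domain.
    prod = [a * b for a, b in zip(fs, fk)]
    # The inverse transform is unnormalized, so divide by n here.
    full = [v.real / n for v in fft(prod, inverse=True)]
    return full[out_start:out_start + out_len]


# Keep three samples of the convolution of [1, 2, 3] with [1, 1].
window = pruned_convolution([1.0, 2.0, 3.0], [1.0, 1.0], 8, 1, 3)
print([round(v, 6) for v in window])
```

This sketch transforms the padding zeros explicitly and computes the full output before discarding most of it. That redundant work is the kind of structure an optimizing back end could in principle exploit, although the abstract does not describe how FFTX does so.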

