Design of a modern fast Fourier transform and cache effective bit-reversal algorithm

IF 0.7 Q4 COMPUTER SCIENCE, THEORY & METHODS

International Journal of Parallel Emergent and Distributed Systems Pub Date : 2023-03-02 DOI:10.1080/17445760.2023.2179049

Adam Simek, I. Šimeček

Citations: 0

Abstract

This article addresses efficient vectorization of the fast Fourier transform (FFT) algorithm, focusing on Cooley–Tukey versions with power-of-two radices. In addition to example optimizations for 256-bit and 512-bit vectors, it discusses the relationships among the individual radix-based versions, vectorization, and OpenMP threading. These ideas develop into a timeless design of the FFT algorithm that works with any vector size and radix version by converting the result into the radix-2 output permutation. The article also introduces an implementation of the Cache Optimized Bit-Reversal algorithm that doubles the performance of its predecessor.
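To make the terms in the abstract concrete, the following is a minimal scalar sketch of a radix-2 Cooley–Tukey FFT whose butterflies leave the result in bit-reversed order, followed by a bit-reversal pass that restores natural order. It is not the authors' vectorized, OpenMP-threaded, or cache-optimized implementation. The function names `bit_reverse_indices` and `fft_dif_radix2` are hypothetical, and the decimation-in-frequency layout is an assumption chosen only because it produces the radix-2 output permutation the abstract refers to.

```python
import cmath


def bit_reverse_indices(n):
    """Return the bit-reversal permutation of range(n); n must be a power of two."""
    if n < 1 or n & (n - 1):
        raise ValueError("n must be a power of two")
    bits = n.bit_length() - 1
    rev = [0] * n
    for i in range(1, n):
        # Reuse the reversal of i >> 1, then move the lowest bit of i to the top.
        rev[i] = (rev[i >> 1] >> 1) | ((i & 1) << (bits - 1))
    return rev


def fft_dif_radix2(x):
    """Radix-2 decimation-in-frequency FFT (illustrative sketch, scalar code)."""
    n = len(x)
    rev = bit_reverse_indices(n)  # also validates that n is a power of two
    a = [complex(v) for v in x]
    size = n
    while size >= 2:
        half = size // 2
        for start in range(0, n, size):
            for k in range(half):
                w = cmath.exp(-2j * cmath.pi * k / size)  # twiddle factor
                u = a[start + k]
                v = a[start + k + half]
                a[start + k] = u + v
                a[start + k + half] = (u - v) * w
        size //= 2
    # After the butterflies, a[j] holds X[rev[j]]; undo the output permutation.
    return [a[rev[i]] for i in range(n)]


if __name__ == "__main__":
    print(bit_reverse_indices(8))  # [0, 4, 2, 6, 1, 5, 3, 7]
    print(fft_dif_radix2([1, 2, 3, 4]))
```

The final list comprehension is the naive, cache-unfriendly form of bit reversal: it reads `a` with large strides. That access pattern is the kind of cost a cache-optimized bit-reversal algorithm aims to reduce.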




Source journal

International Journal of Parallel Emergent and Distributed Systems (Computer Science, Theory & Methods)

CiteScore

2.30

Self-citation rate

0.00%

Publication count