The prime factor non-binary discrete Fourier transform and use of Crystal_Router as a general purpose communication routine
G. Aloisio, Nicola Veneziani, Jai Sam Kim, G. Fox
Conference on Hypercube Concurrent Computers and Applications, 1989-01-03. DOI: 10.1145/63047.63087 (https://doi.org/10.1145/63047.63087)
We have implemented one of the fast Fourier transform algorithms, the prime factor algorithm (PFA), on the hypercube. On sequential computers, the PFA and other discrete Fourier transform (DFT) algorithms such as the Winograd algorithm (WFA) are known to be very efficient. However, both algorithms require full data shuffling and are therefore challenging for any distributed-memory parallel computer. We use a concurrent communication algorithm, called the Crystal_Router, to communicate the shuffled data. We show that the speed gained from reduced arithmetic, compared with the binary FFT, is sufficient to overcome the extra communication requirement up to a certain number of processors. Beyond this point, the standard Cooley-Tukey FFT algorithm performs best. We comment briefly on the application of the DFT to signal processing in synthetic aperture radar (SAR).
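To make the "full data shuffling" concrete, the following is a minimal sequential sketch of the textbook two-factor prime factor (Good-Thomas) index mapping. It is not the authors' hypercube implementation and does not model the Crystal_Router. The function names `naive_dft` and `pfa_dft` are hypothetical, the short transforms are plain O(n^2) DFTs rather than optimized small-length modules, and `pow(a, -1, m)` assumes Python 3.8 or later.

```python
import cmath
from math import gcd


def naive_dft(x):
    """Direct O(n^2) DFT, used here as the short transform and as a reference."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
        for k in range(n)
    ]


def pfa_dft(x, n1, n2):
    """Length n1*n2 DFT via the two-factor prime factor mapping (illustrative sketch)."""
    n = n1 * n2
    if len(x) != n or gcd(n1, n2) != 1:
        raise ValueError("need len(x) == n1 * n2 with n1 and n2 coprime")

    # Input shuffle: element (a, b) of the grid is x[(a*n2 + b*n1) mod n].
    grid = [[x[(a * n2 + b * n1) % n] for b in range(n2)] for a in range(n1)]

    # Length-n2 DFT along every row; no twiddle factors are needed afterwards.
    grid = [naive_dft(row) for row in grid]

    # Length-n1 DFT along every column; cols[k2][k1] holds the 2-D result.
    cols = [naive_dft([grid[a][b] for a in range(n1)]) for b in range(n2)]

    # Output shuffle via the Chinese remainder theorem.
    q2 = pow(n2, -1, n1)  # inverse of n2 modulo n1
    q1 = pow(n1, -1, n2)  # inverse of n1 modulo n2
    out = [0j] * n
    for k1 in range(n1):
        for k2 in range(n2):
            out[(k1 * n2 * q2 + k2 * n1 * q1) % n] = cols[k2][k1]
    return out
```

For example, `pfa_dft(x, 3, 5)` on a 15-point input agrees with `naive_dft(x)` up to rounding error. The two index permutations touch every element, which is the data movement that becomes interprocessor communication once the array is spread over distributed memory.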