Open Access Journal of Applied Science and Technology. Journal Article, published 2023-07-14. DOI: https://doi.org/10.33140/oajast.01.02.01
A Comparison of Two Recent Approaches, Exploiting Pipelined FFT and Memory-Based FHT Architectures, for Resource-Efficient Parallel Computation of Real-Data DFT
This paper compares and assesses the performance and capabilities of two recently developed approaches to computing the real-data DFT. The approaches exploit pipelined FFT and memory-based FHT architectures, respectively, and aim to produce resource-efficient parallel solutions suitable for resource- and power-constrained environments. The FFT-based solutions involve multi-PE (processing element) pipelined designs, geared to streaming (or serial) operation, that exploit the conjugate-symmetric nature of the real-data DFT spectrum. The FHT-based solutions, which are suitably optimized versions of the regularized FHT, are geared to batch (or block-based) operation and involve a memory-based single-PE design that exploits partitioned memory to achieve eightfold parallelism within the PE. After outlining the performance objectives of each approach, the study highlights the key properties and relative advantages and disadvantages of each, showing how arithmetic complexity may be traded off against memory requirement in order to optimize the use of the available silicon resources on the target computing device and to meet the appropriate timing objectives or constraints. A number of additional design issues not addressed in recent real-data FFT research, in particular those relating to design simplicity, regularity and scalability, are also discussed, enabling a more comprehensive assessment of a solution's capabilities.
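The two properties the abstract relies on can be checked numerically. The sketch below is only an illustration under stated assumptions: it uses naive O(N^2) transforms in plain Python, not the pipelined FFT or regularized FHT architectures the paper compares, and the function names (`dft`, `dht`, `dft_from_dht`) are hypothetical. It shows that the DFT of real data is conjugate symmetric, so only about half of the outputs need to be computed, and that the same DFT can be recovered from the real-valued discrete Hartley transform.

```python
import cmath
import math


def dft(x):
    """Naive O(N^2) DFT: X[k] = sum_t x[t] * exp(-2*pi*i*k*t/N)."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def dht(x):
    """Naive O(N^2) discrete Hartley transform with kernel cas(a) = cos(a) + sin(a)."""
    n = len(x)
    out = []
    for k in range(n):
        acc = 0.0
        for t in range(n):
            a = 2 * math.pi * k * t / n
            acc += x[t] * (math.cos(a) + math.sin(a))
        out.append(acc)
    return out


def dft_from_dht(h):
    """Recover the DFT of real data from its DHT (illustrative helper).

    Re X[k] = (H[k] + H[N-k]) / 2
    Im X[k] = (H[N-k] - H[k]) / 2
    """
    n = len(h)
    return [
        complex((h[k] + h[(-k) % n]) / 2, (h[(-k) % n] - h[k]) / 2)
        for k in range(n)
    ]


if __name__ == "__main__":
    x = [1.0, 2.0, 0.5, -1.0, 3.0, 0.0, -2.5, 4.0]  # arbitrary real test data
    X = dft(x)
    n = len(x)
    # Conjugate symmetry of the real-data spectrum: X[N-k] == conj(X[k]).
    symmetric = all(abs(X[(-k) % n] - X[k].conjugate()) < 1e-9 for k in range(n))
    # The DHT route yields the same spectrum using real arithmetic only.
    same = all(abs(a - b) < 1e-9 for a, b in zip(X, dft_from_dht(dht(x))))
    print(symmetric, same)
```

Because the Hartley transform maps real inputs to real outputs, an FHT-based design can avoid complex arithmetic in the transform itself and pay only a cheap add/subtract step to form the complex spectrum. This is one way to read the arithmetic-versus-memory trade-off the abstract mentions.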