D. Stevens, N. Glynn, Panagiotis Galiatsatos, V. Chouliaras, D. Reisis
{"title":"评估可配置的、可扩展的VLIW处理器在FFT执行中的性能","authors":"D. Stevens, N. Glynn, Panagiotis Galiatsatos, V. Chouliaras, D. Reisis","doi":"10.1109/ICECS.2009.5410773","DOIUrl":null,"url":null,"abstract":"This paper presents the setup and the evaluation of the LE1 configurable, extensible, multi-cluster VLIW processor system in FFT execution. The input code is a C implementation of the FFT algorithm and we evaluate its performance on the LE1 simulator for multiple CPU configurations (issue width, execution resource mix, custom instruction) and compiler optimizations (inlining, loop unrolling) in an effort to optimize the cycle count. We identify the prevailing LE1 configurations, with respect to the FFT cycle performance, their silicon area and the power dissipation. Finally, we compare these results to a fully systolic single datapath delay feedback (SDF) VLSI FFT architecture derived from the same C code.","PeriodicalId":343974,"journal":{"name":"2009 16th IEEE International Conference on Electronics, Circuits and Systems - (ICECS 2009)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"Evaluating the performance of a configurable, extensible VLIW processor in FFT execution\",\"authors\":\"D. Stevens, N. Glynn, Panagiotis Galiatsatos, V. Chouliaras, D. Reisis\",\"doi\":\"10.1109/ICECS.2009.5410773\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"This paper presents the setup and the evaluation of the LE1 configurable, extensible, multi-cluster VLIW processor system in FFT execution. The input code is a C implementation of the FFT algorithm and we evaluate its performance on the LE1 simulator for multiple CPU configurations (issue width, execution resource mix, custom instruction) and compiler optimizations (inlining, loop unrolling) in an effort to optimize the cycle count. We identify the prevailing LE1 configurations, with respect to the FFT cycle performance, their silicon area and the power dissipation. Finally, we compare these results to a fully systolic single datapath delay feedback (SDF) VLSI FFT architecture derived from the same C code.\",\"PeriodicalId\":343974,\"journal\":{\"name\":\"2009 16th IEEE International Conference on Electronics, Circuits and Systems - (ICECS 2009)\",\"volume\":\"49 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2009-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2009 16th IEEE International Conference on Electronics, Circuits and Systems - (ICECS 2009)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ICECS.2009.5410773\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 16th IEEE International Conference on Electronics, Circuits and Systems - (ICECS 2009)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICECS.2009.5410773","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Evaluating the performance of a configurable, extensible VLIW processor in FFT execution
This paper presents the setup and the evaluation of the LE1 configurable, extensible, multi-cluster VLIW processor system in FFT execution. The input code is a C implementation of the FFT algorithm and we evaluate its performance on the LE1 simulator for multiple CPU configurations (issue width, execution resource mix, custom instruction) and compiler optimizations (inlining, loop unrolling) in an effort to optimize the cycle count. We identify the prevailing LE1 configurations, with respect to the FFT cycle performance, their silicon area and the power dissipation. Finally, we compare these results to a fully systolic single datapath delay feedback (SDF) VLSI FFT architecture derived from the same C code.