Implementation of a threaded dataflow multiprocessor using FPGAs
K. Tatas, C. Kyriacou
2011 6th International Conference on Design & Technology of Integrated Systems in Nanoscale Era (DTIS), 2011-04-06
DOI: 10.1109/DTIS.2011.5941444
This paper presents the FPGA implementation and evaluation of a prototype Data-Driven Multithreading chip multiprocessor. In particular, we study the FPGA implementation of a Thread Synchronization Unit (TSU), a hardware unit that enables thread execution using dataflow rules on a chip multiprocessor. Threads are scheduled for execution based on data availability, i.e., a thread is fired only if its input data is available. This is called the non-blocking Data-Driven Multithreading (DDM) model of execution. Owing to its dataflow characteristics, the model exploits parallelism and tolerates latency. The DDM model has been evaluated using an execution-driven simulator and showed an average speedup of 26 on a 32-node system. For evaluation purposes, the design has been implemented on a Xilinx Virtex-5 FPGA using MicroBlaze processors as execution cores. Experimental results show that the TSU can be implemented with a moderate hardware budget, and that the delays incurred by the operation of the TSU can be tolerated. Furthermore, hardware complexity evaluation shows that the TSU size scales very well with the number of processors in the MPSoC.
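The firing rule described in the abstract (a thread runs only once its input data is available, and then runs to completion without blocking) can be illustrated with a small software model. This is only a sketch under stated assumptions, not the paper's hardware TSU: the class `ThreadSyncUnit`, its methods, the per-thread count of missing inputs, and the thread names in `demo` are all hypothetical and chosen for illustration.

```python
from collections import deque


class ThreadSyncUnit:
    """Toy software model of data-driven scheduling (hypothetical, not the paper's design)."""

    def __init__(self):
        self.missing = {}      # thread id -> number of inputs not yet produced
        self.consumers = {}    # thread id -> ids of threads that consume its output
        self.body = {}         # thread id -> callable executed when the thread fires
        self.ready = deque()   # threads whose inputs are all available

    def add_thread(self, tid, body, n_inputs, consumers=()):
        self.missing[tid] = n_inputs
        self.consumers[tid] = list(consumers)
        self.body[tid] = body
        if n_inputs == 0:
            self.ready.append(tid)  # no inputs needed, so it is ready at once

    def run(self):
        fired = []
        while self.ready:
            tid = self.ready.popleft()
            self.body[tid]()        # non-blocking: the thread runs to completion
            fired.append(tid)
            # Completion makes this thread's output available to its consumers.
            for c in self.consumers[tid]:
                self.missing[c] -= 1
                if self.missing[c] == 0:
                    self.ready.append(c)
        return fired


def demo():
    values = {}
    tsu = ThreadSyncUnit()
    # "sum" is registered first but cannot fire until both producers finish.
    tsu.add_thread("sum", lambda: values.update(s=values["a"] + values["b"]), 2)
    tsu.add_thread("load_a", lambda: values.update(a=2), 0, ["sum"])
    tsu.add_thread("load_b", lambda: values.update(b=3), 0, ["sum"])
    return tsu.run(), values
```

In this model, `demo()` fires the two producer threads before `sum` even though `sum` was registered first, because scheduling follows data availability rather than program order. A real TSU would dispatch ready threads to several cores concurrently; the single ready queue here is a simplification.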