FPGA上大型平衡树结构的可扩展高吞吐量架构(仅抽象)

FPGA. ACM International Symposium on Field-Programmable Gate Arrays Pub Date : 2013-02-11 DOI:10.1145/2435264.2435345

Yun Qu, V. Prasanna

{"title":"FPGA上大型平衡树结构的可扩展高吞吐量架构(仅抽象)","authors":"Yun Qu, V. Prasanna","doi":"10.1145/2435264.2435345","DOIUrl":null,"url":null,"abstract":"Architectures for tree structures on FPGAs as well as ASICs have been proposed over the years. The exponential growth in the memory size with respect to the tree levels restricts the scalability of these architectures due to limited on-chip memory. For large trees, off-chip memory has to be used. We propose a pipeline architecture on FPGA for large balanced tree structures which achieves both scalability and high throughput. In the proposed architecture, each tree level is mapped onto a single or multiple Processing Elements (PEs) using dual-port distributed RAM, dual-port block RAM and off-chip RAM. We parameterize the pipeline architecture and optimize the performance with respect to scalability and throughput. The resulting architecture for the search tree is dual-threaded and deeply pipelined. It can accept two search requests per clock cycle and operates at a high clock rate of 280MHz. Post place-and-route results show that, by using only 17% of the logic resources and 9% of the BRAM available on a state-of-the-art FPGA, our dual-thread pipelined search tree can perform 560 million search operations per second in a tree containing 512K 64-bit keys.","PeriodicalId":87257,"journal":{"name":"FPGA. ACM International Symposium on Field-Programmable Gate Arrays","volume":"29 1","pages":"278"},"PeriodicalIF":0.0000,"publicationDate":"2013-02-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Scalable high-throughput architecture for large balanced tree structures on FPGA (abstract only)\",\"authors\":\"Yun Qu, V. Prasanna\",\"doi\":\"10.1145/2435264.2435345\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Architectures for tree structures on FPGAs as well as ASICs have been proposed over the years. The exponential growth in the memory size with respect to the tree levels restricts the scalability of these architectures due to limited on-chip memory. For large trees, off-chip memory has to be used. We propose a pipeline architecture on FPGA for large balanced tree structures which achieves both scalability and high throughput. In the proposed architecture, each tree level is mapped onto a single or multiple Processing Elements (PEs) using dual-port distributed RAM, dual-port block RAM and off-chip RAM. We parameterize the pipeline architecture and optimize the performance with respect to scalability and throughput. The resulting architecture for the search tree is dual-threaded and deeply pipelined. It can accept two search requests per clock cycle and operates at a high clock rate of 280MHz. Post place-and-route results show that, by using only 17% of the logic resources and 9% of the BRAM available on a state-of-the-art FPGA, our dual-thread pipelined search tree can perform 560 million search operations per second in a tree containing 512K 64-bit keys.\",\"PeriodicalId\":87257,\"journal\":{\"name\":\"FPGA. ACM International Symposium on Field-Programmable Gate Arrays\",\"volume\":\"29 1\",\"pages\":\"278\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-02-11\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"FPGA. ACM International Symposium on Field-Programmable Gate Arrays\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/2435264.2435345\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"FPGA. ACM International Symposium on Field-Programmable Gate Arrays","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2435264.2435345","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 1

摘要

多年来，在fpga和asic上提出了树形结构的体系结构。由于片上内存有限，相对于树级别的内存大小的指数级增长限制了这些体系结构的可伸缩性。对于大型树，必须使用片外存储器。在FPGA上提出了一种大型平衡树型结构的流水线架构，实现了可扩展性和高吞吐量。在提出的架构中，每个树级都被映射到使用双端口分布式RAM、双端口块RAM和片外RAM的单个或多个处理元素(pe)上。我们对管道架构进行了参数化，并在可扩展性和吞吐量方面优化了性能。搜索树的最终架构是双线程和深度流水线的。它每个时钟周期可以接受两个搜索请求，并以280MHz的高时钟速率工作。放置和路由后的结果表明，在最先进的FPGA上仅使用17%的逻辑资源和9%的BRAM，我们的双线程流水线搜索树每秒可以在包含512K 64位密钥的树中执行5.6亿次搜索操作。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

Scalable high-throughput architecture for large balanced tree structures on FPGA (abstract only)

Architectures for tree structures on FPGAs as well as ASICs have been proposed over the years. The exponential growth in the memory size with respect to the tree levels restricts the scalability of these architectures due to limited on-chip memory. For large trees, off-chip memory has to be used. We propose a pipeline architecture on FPGA for large balanced tree structures which achieves both scalability and high throughput. In the proposed architecture, each tree level is mapped onto a single or multiple Processing Elements (PEs) using dual-port distributed RAM, dual-port block RAM and off-chip RAM. We parameterize the pipeline architecture and optimize the performance with respect to scalability and throughput. The resulting architecture for the search tree is dual-threaded and deeply pipelined. It can accept two search requests per clock cycle and operates at a high clock rate of 280MHz. Post place-and-route results show that, by using only 17% of the logic resources and 9% of the BRAM available on a state-of-the-art FPGA, our dual-thread pipelined search tree can perform 560 million search operations per second in a tree containing 512K 64-bit keys.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

FPGA. ACM International Symposium on Field-Programmable Gate Arrays

自引率

0.00%

发文量