Architecture and performance models for scalable IP lookup engines on FPGA

2013 IEEE 14th International Conference on High Performance Switching and Routing (HPSR). Pub Date: 2013-07-08. DOI: 10.1109/HPSR.2013.6602306

Y. Yang, Yun Qu, Swapnil Haria, V. Prasanna

Citations: 5

Abstract

We propose a unified methodology for optimizing IPv4 and IPv6 lookup engines based on the balanced range tree (BRTree) architecture on FPGA. A general BRTree-based IP lookup solution features one or more linear pipelines and has a large, complex design space. To allow fast exploration of this design space, we develop a concise set of performance models that characterize the tradeoffs among throughput, table size, lookup latency, and resource requirements of the IP lookup engine. In particular, a simple but realistic model of DDR3 memory is used to accurately estimate off-chip memory performance. The proposed methodology then uses these models to optimize for three separate goals: high lookup rates, large prefix tables, and a fixed maximum lookup latency. In our prototyping scenarios, a state-of-the-art FPGA could support (1) up to 24 M IPv6 prefixes at 400 Mlps (million lookups per second); (2) up to 1.6 Blps (billion lookups per second) with 1.1 M IPv4 prefixes; and (3) up to 554 K IPv4 prefixes at 400 Mlps with lookup latency bounded by 400 ns. All our designs achieve 5.6x to 70x the energy efficiency of TCAM, and their performance is independent of the prefix distribution.
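The abstract does not spell out how a range-based lookup works, so the following is only a minimal software sketch of the general idea that range trees rely on: prefixes are flattened into disjoint address ranges, and a lookup becomes a binary search over the sorted range boundaries. It is not the authors' BRTree pipeline or their FPGA design. The function names `build_range_table` and `lookup`, the next-hop labels, and the quadratic table construction are all assumptions made for illustration, and the sketch assumes a single address family per table.

```python
from bisect import bisect_right
from ipaddress import ip_address, ip_network


def build_range_table(prefixes):
    """Flatten {CIDR string: next hop} into sorted range starts and per-range hops.

    Hypothetical helper for illustration; assumes all prefixes share one
    address family (all IPv4 or all IPv6).
    """
    nets = [(ip_network(p), hop) for p, hop in prefixes.items()]

    # Each prefix contributes two boundaries: its first address and the
    # address just past its last one. Address 0 always starts a range.
    points = {0}
    for net, _ in nets:
        first = int(net.network_address)
        past_last = int(net.broadcast_address) + 1
        points.add(first)
        if past_last < 2 ** net.max_prefixlen:
            points.add(past_last)
    starts = sorted(points)

    # Between two consecutive boundaries the set of matching prefixes is
    # constant, so the longest match at the range start holds for the range.
    hops = []
    for start in starts:
        best_hop, best_len = None, -1
        for net, hop in nets:
            first = int(net.network_address)
            last = int(net.broadcast_address)
            if first <= start <= last and net.prefixlen > best_len:
                best_hop, best_len = hop, net.prefixlen
        hops.append(best_hop)
    return starts, hops


def lookup(table, addr):
    """Return the next hop of the longest matching prefix, or None if none matches."""
    starts, hops = table
    # Binary search for the last range start that is <= the address.
    index = bisect_right(starts, int(ip_address(addr))) - 1
    return hops[index]


table = build_range_table({
    "0.0.0.0/0": "default",
    "10.0.0.0/8": "A",
    "10.1.0.0/16": "B",
})
```

In this sketch the number of comparisons per lookup depends only on the number of range boundaries, not on how the prefixes are distributed over lengths, which is the property the abstract highlights for range-tree designs. A hardware design would presumably map tree levels onto pipeline stages rather than run a software binary search.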
