Survey on and re-evaluation of wide adder architectures on FPGAs

2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig), Pub Date: 2016-11-01, DOI: 10.1109/ReConFig.2016.7857189

Thomas B. Preußer, Markus Krause


Citations: 2

Abstract

Binary word addition was one of the earliest operations to call for special-purpose hardware structures on otherwise freely programmable logic devices. The large logic depth induced by the wide fan-in, which spans both operands of the addition, is especially harmful in SRAM-programmed FPGAs, where the delays of the configurable inter-LUT routing are expensive compared to the delays of the logic stages it connects. These costs have been addressed by carry chains, which establish direct links through a linear series of configurable logic blocks on virtually all modern FPGA devices. This structure is so effective that it puts a simple linear adder layout at a great advantage. Although the linear adder must eventually fall behind more sophisticated hierarchical adder structures with logarithmic delays, the actual turning point has been pushed out to operand widths of 50 bits or more. Thus, many designs can simply rely on the default addition that the hardware supports so well. This changes for extraordinarily wide operands, as are often found in cryptographic applications. They require designers to identify a wide adder implementation that meets their design goals. The typical bottleneck imposed by the wide fan-in of addition is the achievable clock rate. Various authors have analyzed the performance of the classic fast addition schemes and proposed adder architectures that genuinely blend classic hierarchical approaches with the capabilities of the fast carry chains. This paper presents a survey of these proposals and re-evaluates them in the context of modern FPGA devices.
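
The contrast the abstract draws between a linear adder and a hierarchical adder with logarithmic delay can be illustrated with a small bit-level model. The sketch below is not taken from the paper: the function names are hypothetical, and Kogge-Stone is chosen here only as a familiar example of a parallel-prefix scheme, not necessarily one of the architectures the survey covers. It counts logic levels only and does not model FPGA timing; on real devices each ripple step runs on the dedicated carry chain and is far cheaper than a LUT-plus-routing level, which is why the abstract places the crossover at wide operands.

```python
def ripple_carry_add(a, b, width):
    """Linear adder: the carry moves through one bit position per step."""
    carry, total = 0, 0
    for i in range(width):
        x, y = (a >> i) & 1, (b >> i) & 1
        total |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    # Sum modulo 2**width, and the number of carry steps on the critical path.
    return total, width


def kogge_stone_add(a, b, width):
    """Parallel-prefix adder: carries resolve in ceil(log2(width)) levels."""
    mask = (1 << width) - 1
    g = a & b & mask      # per-bit generate
    p = (a ^ b) & mask    # per-bit propagate
    p0 = p                # keep the bitwise half-sum for the final XOR
    levels, dist = 0, 1
    while dist < width:
        # Combine each bit's group with the group ending dist positions below.
        g = g | ((p & (g << dist)) & mask)
        p = p & (p << dist) & mask
        dist <<= 1
        levels += 1
    carries = (g << 1) & mask  # carry into bit i is the carry out of bit i-1
    return p0 ^ carries, levels
```

For 256-bit operands, as might occur in a cryptographic datapath, the model reports 256 sequential carry steps for the linear adder against 8 prefix levels for the hierarchical one, at the price of considerably more logic and routing per level.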

