Survey on and re-evaluation of wide adder architectures on FPGAs
Thomas B. Preußer, Markus Krause
2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig), published 2016-11-01
DOI: 10.1109/ReConFig.2016.7857189 (https://doi.org/10.1109/ReConFig.2016.7857189)
Citations: 2
Abstract
Binary word addition was one of the earliest operations that called for special-purpose hardware structures on otherwise freely programmable logic devices. The large logic depth induced by the wide fan-in, which comprises both operands of the addition, is especially harmful in SRAM-programmed FPGAs, where the delays of the configurable inter-LUT routing are expensive in comparison to the delays of the connected logic stages. These costs have been addressed by carry chains that establish direct links through a linear series of configurable logic blocks on virtually all modern FPGA devices. This structure is so effective that it puts a simple linear adder layout at a great advantage. Although the linear layout must eventually fall behind more sophisticated hierarchical adder structures with logarithmic delays, the actual turning point has been pushed beyond operand widths of 50 bits. Thus, many designs can simply rely on the default addition that is supported so well directly by the hardware. This changes for the extraordinarily wide operands often found in cryptographic applications, which require designers to identify a wide adder implementation that meets their design goals. The typical bottleneck imposed by the wide fan-in of addition is the achievable clock rate. Various authors have analyzed the performance of the classic fast addition schemes and proposed adder architectures that genuinely blend classic hierarchical approaches with the capabilities of the fast carry chains. This paper presents a survey across these proposals and re-evaluates them in the context of modern FPGA devices.
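To make the contrast between a linear carry chain and a hierarchical scheme concrete, the following is a minimal behavioral sketch in Python. It is not taken from the paper and does not reproduce any of the surveyed architectures. It models a plain ripple-carry adder and a simple carry-select adder, one classic way of combining short linear chains with a selection hierarchy. The function names `ripple_add` and `carry_select_add` and the default `block` size of 32 bits are illustrative assumptions.

```python
def ripple_add(a, b, width, carry_in=0):
    """Bit-by-bit ripple-carry addition; returns (sum, carry_out).

    Models a linear carry chain: each bit position waits for the
    carry of the position below it.
    """
    carry = carry_in
    total = 0
    for i in range(width):
        x = (a >> i) & 1
        y = (b >> i) & 1
        total |= (x ^ y ^ carry) << i
        carry = (x & y) | (carry & (x ^ y))
    return total, carry


def carry_select_add(a, b, width, block=32):
    """Carry-select addition over blocks of `block` bits (assumed size).

    Each block computes its sum for both possible carry-in values;
    the real incoming carry then only selects one of the two results.
    """
    total = 0
    carry = 0
    for lo in range(0, width, block):
        w = min(block, width - lo)          # last block may be narrower
        m = (1 << w) - 1
        xa = (a >> lo) & m
        xb = (b >> lo) & m
        s0, c0 = ripple_add(xa, xb, w, 0)   # candidate for carry-in = 0
        s1, c1 = ripple_add(xa, xb, w, 1)   # candidate for carry-in = 1
        s, carry = (s1, c1) if carry else (s0, c0)
        total |= s << lo
    return total, carry
```

This software model only checks functional equivalence and says nothing about timing. In hardware, both candidate sums of every block would be computed in parallel on short carry chains, so the critical path would consist of one block-length chain plus the chain of selections, at the cost of roughly duplicating the adder logic.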