fpga上宽加法器结构的研究与再评价

Thomas B. Preußer, Markus Krause
{"title":"fpga上宽加法器结构的研究与再评价","authors":"Thomas B. Preußer, Markus Krause","doi":"10.1109/ReConFig.2016.7857189","DOIUrl":null,"url":null,"abstract":"The binary word addition was one of the earliest operations that called for special-purpose hardware structures on otherwise freely programmable logic devices. The large logic depth induced by the great fanin that comprises both operands of the addition is especially harmful in SRAM-programmed FPGAs where the delays of the configurable inter-LUT routing are expensive in comparison to the delays of the connected logic stages. These costs have been addressed by carry chains that establish direct links through a linear series of configurable logic blocks on virtually all modern FPGA devices. This structure is so effective that it puts a simple linear adder layout at a great advantage. Although it must eventually recede behind more sophisticated hierarchical adder structures with logarithmic delays, the actual turning point has been pushed beyond operand widths of 50 bits or more. Thus, many designs can simply rely on the default addition that is so well supported directly by the hardware. This changes in the context of extraordinarily wide operands as they are often found in cryptographic applications. They require designers to identify an appropriate wide adder implementation that is able to meet their design goals. The typical bottleneck imposed by the wide fanin of addition is the achievable clock rate. Various authors have analyzed the performance of the classic fast addition schemes and proposed adder architectures that genuinely blend classic hierarchical approaches with the capabilities of the fast carry chains. This paper presents a survey across these proposals and re-evaluates them in the context of modern FPGA devices.","PeriodicalId":431909,"journal":{"name":"2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2016-11-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":"{\"title\":\"Survey on and re-evaluation of wide adder architectures on FPGAs\",\"authors\":\"Thomas B. Preußer, Markus Krause\",\"doi\":\"10.1109/ReConFig.2016.7857189\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The binary word addition was one of the earliest operations that called for special-purpose hardware structures on otherwise freely programmable logic devices. The large logic depth induced by the great fanin that comprises both operands of the addition is especially harmful in SRAM-programmed FPGAs where the delays of the configurable inter-LUT routing are expensive in comparison to the delays of the connected logic stages. These costs have been addressed by carry chains that establish direct links through a linear series of configurable logic blocks on virtually all modern FPGA devices. This structure is so effective that it puts a simple linear adder layout at a great advantage. Although it must eventually recede behind more sophisticated hierarchical adder structures with logarithmic delays, the actual turning point has been pushed beyond operand widths of 50 bits or more. Thus, many designs can simply rely on the default addition that is so well supported directly by the hardware. This changes in the context of extraordinarily wide operands as they are often found in cryptographic applications. They require designers to identify an appropriate wide adder implementation that is able to meet their design goals. The typical bottleneck imposed by the wide fanin of addition is the achievable clock rate. Various authors have analyzed the performance of the classic fast addition schemes and proposed adder architectures that genuinely blend classic hierarchical approaches with the capabilities of the fast carry chains. This paper presents a survey across these proposals and re-evaluates them in the context of modern FPGA devices.\",\"PeriodicalId\":431909,\"journal\":{\"name\":\"2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig)\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2016-11-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"2\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ReConFig.2016.7857189\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 International Conference on ReConFigurable Computing and FPGAs (ReConFig)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ReConFig.2016.7857189","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2

摘要

二进制单词加法运算是最早需要在自由可编程逻辑设备上使用专用硬件结构的运算之一。由包含加法的两个操作数的大fanin引起的大逻辑深度在sram编程的fpga中尤其有害,其中可配置inter-LUT路由的延迟与连接逻辑级的延迟相比是昂贵的。这些成本已经通过在几乎所有现代FPGA器件上通过线性系列可配置逻辑块建立直接链接的进位链来解决。这种结构非常有效,它使简单的线性加法器布局具有很大的优势。尽管它最终必须落后于具有对数延迟的更复杂的分层加法器结构,但实际的转折点已经超越了50位或更大的操作数宽度。因此,许多设计可以简单地依赖于由硬件直接支持的默认添加。这在非常宽的操作数上下文中发生了变化,因为它们经常出现在加密应用程序中。它们要求设计人员确定能够满足其设计目标的适当的宽加法器实现。大范围的加法带来的典型瓶颈是可实现的时钟速率。许多作者分析了经典快速加法方案的性能,并提出了真正将经典分层方法与快速进位链的能力相结合的加法器体系结构。本文概述了这些建议,并在现代FPGA设备的背景下重新评估它们。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Survey on and re-evaluation of wide adder architectures on FPGAs
The binary word addition was one of the earliest operations that called for special-purpose hardware structures on otherwise freely programmable logic devices. The large logic depth induced by the great fanin that comprises both operands of the addition is especially harmful in SRAM-programmed FPGAs where the delays of the configurable inter-LUT routing are expensive in comparison to the delays of the connected logic stages. These costs have been addressed by carry chains that establish direct links through a linear series of configurable logic blocks on virtually all modern FPGA devices. This structure is so effective that it puts a simple linear adder layout at a great advantage. Although it must eventually recede behind more sophisticated hierarchical adder structures with logarithmic delays, the actual turning point has been pushed beyond operand widths of 50 bits or more. Thus, many designs can simply rely on the default addition that is so well supported directly by the hardware. This changes in the context of extraordinarily wide operands as they are often found in cryptographic applications. They require designers to identify an appropriate wide adder implementation that is able to meet their design goals. The typical bottleneck imposed by the wide fanin of addition is the achievable clock rate. Various authors have analyzed the performance of the classic fast addition schemes and proposed adder architectures that genuinely blend classic hierarchical approaches with the capabilities of the fast carry chains. This paper presents a survey across these proposals and re-evaluates them in the context of modern FPGA devices.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信