浮点融合乘加:减少浮点加法的延迟

J. Bruguera, T. Lang
{"title":"浮点融合乘加:减少浮点加法的延迟","authors":"J. Bruguera, T. Lang","doi":"10.1109/ARITH.2005.22","DOIUrl":null,"url":null,"abstract":"In this paper we propose an architecture for the computation of the double-precision floating-point multiply-add fused (MAF) operation A+(B/spl times/C) that permits to compute the floating-point addition with lower latency than floating-point multiplication and MAF. While previous MAF architectures compute the three operations with the same latency, the proposed architecture permits to skip the first pipeline stages, those related with the multiplication B/spl times/C, in case of an addition. For instance, for a MAF unit pipelined into three or five stages, the latency of the floating-point addition is reduced to two or three cycles, respectively. To achieve the latency reduction for floating-point addition, the alignment shifter, which in previous organizations is in parallel with the multiplication, is moved so that the multiplication can be bypassed. To avoid that this modification increases the critical path, a double-datapath organization is used, in which the alignment and normalization are in separate paths. Moreover, we use the techniques developed previously of combining the addition and the rounding and of performing the normalization before the addition.","PeriodicalId":194902,"journal":{"name":"17th IEEE Symposium on Computer Arithmetic (ARITH'05)","volume":"24 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2005-06-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"57","resultStr":"{\"title\":\"Floating-point fused multiply-add: reduced latency for floating-point addition\",\"authors\":\"J. Bruguera, T. Lang\",\"doi\":\"10.1109/ARITH.2005.22\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this paper we propose an architecture for the computation of the double-precision floating-point multiply-add fused (MAF) operation A+(B/spl times/C) that permits to compute the floating-point addition with lower latency than floating-point multiplication and MAF. While previous MAF architectures compute the three operations with the same latency, the proposed architecture permits to skip the first pipeline stages, those related with the multiplication B/spl times/C, in case of an addition. For instance, for a MAF unit pipelined into three or five stages, the latency of the floating-point addition is reduced to two or three cycles, respectively. To achieve the latency reduction for floating-point addition, the alignment shifter, which in previous organizations is in parallel with the multiplication, is moved so that the multiplication can be bypassed. To avoid that this modification increases the critical path, a double-datapath organization is used, in which the alignment and normalization are in separate paths. Moreover, we use the techniques developed previously of combining the addition and the rounding and of performing the normalization before the addition.\",\"PeriodicalId\":194902,\"journal\":{\"name\":\"17th IEEE Symposium on Computer Arithmetic (ARITH'05)\",\"volume\":\"24 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2005-06-27\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"57\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"17th IEEE Symposium on Computer Arithmetic (ARITH'05)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ARITH.2005.22\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"17th IEEE Symposium on Computer Arithmetic (ARITH'05)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ARITH.2005.22","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 57

摘要

本文提出了一种双精度浮点乘加融合(MAF)运算A+(B/spl times/C)的计算架构,该架构允许以比浮点乘法和MAF更低的延迟计算浮点加法。虽然以前的MAF体系结构以相同的延迟计算这三个操作,但提议的体系结构允许跳过第一个管道阶段,即与乘法B/spl /C相关的阶段,以便进行添加。例如,对于一个MAF单元,流水线分为三个或五个阶段,浮点加法的延迟分别减少到两个或三个周期。为了减少浮点加法的延迟,需要移动对齐移位器(在以前的组织中与乘法并行),以便可以绕过乘法。为了避免这种修改增加关键路径,使用了双数据路径组织,其中对齐和规范化位于不同的路径中。此外,我们使用了先前开发的将加法和舍入相结合以及在加法之前执行规范化的技术。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Floating-point fused multiply-add: reduced latency for floating-point addition
In this paper we propose an architecture for the computation of the double-precision floating-point multiply-add fused (MAF) operation A+(B/spl times/C) that permits to compute the floating-point addition with lower latency than floating-point multiplication and MAF. While previous MAF architectures compute the three operations with the same latency, the proposed architecture permits to skip the first pipeline stages, those related with the multiplication B/spl times/C, in case of an addition. For instance, for a MAF unit pipelined into three or five stages, the latency of the floating-point addition is reduced to two or three cycles, respectively. To achieve the latency reduction for floating-point addition, the alignment shifter, which in previous organizations is in parallel with the multiplication, is moved so that the multiplication can be bypassed. To avoid that this modification increases the critical path, a double-datapath organization is used, in which the alignment and normalization are in separate paths. Moreover, we use the techniques developed previously of combining the addition and the rounding and of performing the normalization before the addition.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信