fpga中忠实舍入浮点函数逼近的通用方法

David B. Thomas
{"title":"fpga中忠实舍入浮点函数逼近的通用方法","authors":"David B. Thomas","doi":"10.1109/ARITH.2015.27","DOIUrl":null,"url":null,"abstract":"A barrier to wide-spread use of Field Programmable Gate Arrays (FPGAs) has been the complexity of programming, but recent advances in High-Level Synthesis (HLS) have made it possible for non-experts to easily create floating-point numerical accelerators from C-like code. However, HLS users are limited to the set of numerical primitives provided by HLS vendors and designers of floating-point IP cores, and cannot easily implement new fast or accurate numerical primitives. This paper presents a method for automatically creating high-performance pipelined floating-point function approximations, which can be integrated as IP cores into numerical accelerators, whether derived from HLS or traditional design methods. Both input and output are floating-point, but internally the function approximator uses fixed-point polynomial segments, guaranteeing a faithfully rounded output. A robust and automated non-uniform segmentation scheme is used to segment any twice-differentiable input function and produce platform-independent VHDL. The approach is demonstrated across ten functions, which are automatically generated then placed and routed in Xilinx devices. The method provides a 1.1x-3x improvement in area over composite numerical approximations, while providing similar performance and significantly better relative error.","PeriodicalId":6526,"journal":{"name":"2015 IEEE 22nd Symposium on Computer Arithmetic","volume":"13 1","pages":"42-49"},"PeriodicalIF":0.0000,"publicationDate":"2015-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":"{\"title\":\"A General-Purpose Method for Faithfully Rounded Floating-Point Function Approximation in FPGAs\",\"authors\":\"David B. Thomas\",\"doi\":\"10.1109/ARITH.2015.27\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"A barrier to wide-spread use of Field Programmable Gate Arrays (FPGAs) has been the complexity of programming, but recent advances in High-Level Synthesis (HLS) have made it possible for non-experts to easily create floating-point numerical accelerators from C-like code. However, HLS users are limited to the set of numerical primitives provided by HLS vendors and designers of floating-point IP cores, and cannot easily implement new fast or accurate numerical primitives. This paper presents a method for automatically creating high-performance pipelined floating-point function approximations, which can be integrated as IP cores into numerical accelerators, whether derived from HLS or traditional design methods. Both input and output are floating-point, but internally the function approximator uses fixed-point polynomial segments, guaranteeing a faithfully rounded output. A robust and automated non-uniform segmentation scheme is used to segment any twice-differentiable input function and produce platform-independent VHDL. The approach is demonstrated across ten functions, which are automatically generated then placed and routed in Xilinx devices. The method provides a 1.1x-3x improvement in area over composite numerical approximations, while providing similar performance and significantly better relative error.\",\"PeriodicalId\":6526,\"journal\":{\"name\":\"2015 IEEE 22nd Symposium on Computer Arithmetic\",\"volume\":\"13 1\",\"pages\":\"42-49\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2015-06-22\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"15\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2015 IEEE 22nd Symposium on Computer Arithmetic\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ARITH.2015.27\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE 22nd Symposium on Computer Arithmetic","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ARITH.2015.27","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 15

摘要

广泛使用现场可编程门阵列(fpga)的一个障碍是编程的复杂性,但最近在高级综合(HLS)方面的进展使得非专业人员可以很容易地从类c代码创建浮点数值加速器。然而,HLS用户受到HLS供应商和浮点IP核设计人员提供的一组数字原语的限制,无法轻松实现新的快速或精确的数字原语。本文提出了一种自动创建高性能流水线浮点函数近似的方法,该方法可以作为IP核集成到数值加速器中,无论是源自HLS还是传统设计方法。输入和输出都是浮点数,但函数近似器内部使用定点多项式段,保证忠实地四舍五入输出。采用鲁棒、自动化的非均匀分割方案对任意二次可微输入函数进行分割,生成与平台无关的VHDL。该方法演示了十个功能,这些功能自动生成,然后在Xilinx设备中放置和路由。与复合数值近似相比,该方法的面积提高了1.1 -3倍,同时提供了相似的性能和明显更好的相对误差。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
A General-Purpose Method for Faithfully Rounded Floating-Point Function Approximation in FPGAs
A barrier to wide-spread use of Field Programmable Gate Arrays (FPGAs) has been the complexity of programming, but recent advances in High-Level Synthesis (HLS) have made it possible for non-experts to easily create floating-point numerical accelerators from C-like code. However, HLS users are limited to the set of numerical primitives provided by HLS vendors and designers of floating-point IP cores, and cannot easily implement new fast or accurate numerical primitives. This paper presents a method for automatically creating high-performance pipelined floating-point function approximations, which can be integrated as IP cores into numerical accelerators, whether derived from HLS or traditional design methods. Both input and output are floating-point, but internally the function approximator uses fixed-point polynomial segments, guaranteeing a faithfully rounded output. A robust and automated non-uniform segmentation scheme is used to segment any twice-differentiable input function and produce platform-independent VHDL. The approach is demonstrated across ten functions, which are automatically generated then placed and routed in Xilinx devices. The method provides a 1.1x-3x improvement in area over composite numerical approximations, while providing similar performance and significantly better relative error.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信