Authors: M. Langhammer, B. Pasca
Venue: 2016 IEEE 23rd Symposium on Computer Arithmetic (ARITH)
Published: 2016-07-01
DOI: 10.1109/ARITH.2016.20 (https://doi.org/10.1109/ARITH.2016.20)
Single Precision Natural Logarithm Architecture for Hard Floating-Point and DSP-Enabled FPGAs
In this paper we present a novel method for implementing floating-point (FP) elementary functions using the new single-precision FP addition and multiplication features of the Altera Arria 10 DSP Block architecture. Our application example is log(x), one of the most commonly required functions for emerging datacenter and computing FPGA targets. We explain why the combination of new FPGA technology and a simultaneous, massive increase in computing performance requirements fuels the need for this work. We present a comprehensive error analysis, both for the overall function and for each subsection of the architecture, demonstrating that the hard FP (HFP) blocks, in conjunction with the traditional flexibility and connectivity of the FPGA, can provide a robust, high-performance solution. These methods produce a highly accurate single-precision IEEE 754 function that is OpenCL conformant. Our methods map almost exclusively to embedded structures and therefore significantly reduce logic resources and routing stress compared to current methods; they also demonstrate that newly introduced FPGA routing architectures can be leveraged to use almost no soft resources. Finally, we show that the latency of the log(x) function can be changed independently of the architecture and function, allowing the performance of the function to be matched directly to the system clock rate.
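The abstract does not spell out the decomposition used, so the following is only a minimal sketch of one textbook way to evaluate log(x) with nothing but single-precision additions and multiplications plus small lookup tables, not the authors' architecture. It assumes the identity log(x) = e·ln 2 + (−log r) + log(1 + t) with x = m·2^e, r ≈ 1/m taken from a table, and t = m·r − 1. The table size (`TABLE_BITS`), the degree-4 polynomial, and all function names are hypothetical choices made for illustration; the helper `f32` emulates single-precision rounding in software.

```python
import math
import struct


def f32(v: float) -> float:
    """Round a Python float to the nearest IEEE 754 single-precision value."""
    return struct.unpack("f", struct.pack("f", v))[0]


TABLE_BITS = 5  # hypothetical table size: 32 entries covering [1, 2)
LN2 = f32(math.log(2.0))

# For each sub-interval of [1, 2), store r ~ 1/midpoint and -log(r).
RECIP = []
NEG_LOG_RECIP = []
for i in range(1 << TABLE_BITS):
    mid = 1.0 + (i + 0.5) / (1 << TABLE_BITS)
    r = f32(1.0 / mid)
    RECIP.append(r)
    NEG_LOG_RECIP.append(f32(-math.log(r)))


def log_f32(x: float) -> float:
    """Natural log using only emulated single-precision adds and multiplies.

    Special cases (zero, negatives, infinities, NaN) are rejected rather
    than handled, to keep the sketch short.
    """
    x = f32(x)
    if not (0.0 < x < math.inf):
        raise ValueError("sketch only handles finite positive inputs")

    # Split x = m * 2**e with 1 <= m < 2 (pure exponent/mantissa unpacking).
    m, e = math.frexp(x)   # 0.5 <= m < 1
    m, e = 2.0 * m, e - 1  # 1 <= m < 2

    # Table lookup on the top mantissa bits, then one multiply and one add
    # to bring the argument close to 1: |t| is at most about 2**-6.
    i = int((m - 1.0) * (1 << TABLE_BITS))
    t = f32(f32(m * RECIP[i]) - 1.0)

    # log(1 + t) ~ t - t^2/2 + t^3/3 - t^4/4, evaluated in Horner form.
    p = f32(f32(t * f32(-0.25)) + f32(1.0 / 3.0))
    p = f32(f32(t * p) + f32(-0.5))
    p = f32(f32(t * p) + 1.0)
    p = f32(t * p)

    # Recombine: e*ln2 + (-log r) + log(1 + t).
    return f32(f32(f32(e * LN2) + NEG_LOG_RECIP[i]) + p)
```

For example, `log_f32(10.0)` agrees with `math.log(10.0)` to roughly single-precision accuracy. In this naive form, inputs close to 1 lose relative accuracy because the three terms nearly cancel, which is one reason a per-stage error analysis matters for a conformant implementation.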