Hardware architecture design and mapping of ‘Fast Inverse Square Root’ algorithm
Authors: Saad Zafar, Raviteja Adapa
Published in: 2014 International Conference on Advances in Electrical Engineering (ICAEE), 2014-06-19
DOI: 10.1109/ICAEE.2014.6838433
Citations: 11
Abstract
The Fast Inverse Square Root algorithm was used in earlier 3D games for lighting and reflection calculations because it offers up to a fourfold performance gain. This paper presents a hardware implementation of the algorithm on an FPGA board: the complete architecture is designed and, after thorough functional verification, successfully mapped onto a Xilinx Spartan 3E. The results show that the implementation provides a very efficient single-precision floating-point inverse square root calculator, with practically accurate results available after just 12 short clock cycles. This performance is far superior to that of the algorithm's software counterpart and, unlike the rsqrtss instruction of the x86 SSE instruction set, does not depend on the processor. The results of this work can aid FPGA-based vector processors or graphics processing units with 3D rendering. The hardware design can also form part of a larger floating-point arithmetic unit for dedicated reciprocal square root calculations.
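For orientation, the software counterpart mentioned in the abstract is usually written along the following lines. This is a minimal sketch of the commonly cited form of the algorithm, not the paper's hardware datapath: the function name `fast_rsqrt`, the constant `0x5f3759df`, and the single Newton-Raphson refinement step are assumptions taken from the widely circulated software version, and the abstract does not say which constant or how many iterations the FPGA design uses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Approximate 1/sqrt(x) for a positive, finite, normal binary32 value.
 * Hypothetical helper for illustration; not taken from the paper. */
float fast_rsqrt(float x)
{
    uint32_t bits;
    float y;

    /* The bit trick below only makes sense for a 32-bit float. */
    assert(sizeof(float) == sizeof(uint32_t));

    /* Reinterpret the float's bit pattern as an integer (memcpy avoids
     * the undefined behaviour of pointer-cast type punning). */
    memcpy(&bits, &x, sizeof bits);

    /* Halving and negating the integer view roughly halves and negates
     * the exponent; the constant corrects the bias and the mantissa. */
    bits = 0x5f3759dfu - (bits >> 1);

    /* Reinterpret the adjusted bits as the initial estimate y0. */
    memcpy(&y, &bits, sizeof y);

    /* One Newton-Raphson step on f(y) = 1/y^2 - x refines the estimate. */
    y = y * (1.5f - 0.5f * x * y * y);

    return y;
}
```

Calling `fast_rsqrt(4.0f)` returns a value close to 0.5; with one refinement step the result is an approximation, typically within a fraction of a percent of the exact value, which matches the "practically accurate" wording of the abstract rather than full single-precision accuracy.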