{"title":"High-speed inverse square roots","authors":"M. Schulte, K. Wires","doi":"10.1109/ARITH.1999.762837","DOIUrl":null,"url":null,"abstract":"Inverse square roots are used in several digital signal processing, multimedia, and scientific computing applications. This paper presents a high-speed method for computing inverse square roots. This method uses a table lookup, operand modification, and multiplication to obtain an initial approximation to the inverse square root. This is followed by a modified Newton-Raphson iteration, consisting of one square, one multiply-complement, and one multiply-add operation. The initial approximation and Newton-Raphson iteration employ specialized hardware to reduce the delay, area, and power dissipation. Application of this method is illustrated through the design of an inverse square root unit operands in the IEEE single precision format. An implementation of this unit with a 4-layer metal, 2.5 Volt, 0.25 micron CMOS standard cell library has a cycle rime of 6.7 ns, an area of 0.41 mm/sup 2/, a latency of five cycles, and a throughput of one result per cycle.","PeriodicalId":434169,"journal":{"name":"Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1999-04-14","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"28","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ARITH.1999.762837","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 28
Abstract
Inverse square roots are used in several digital signal processing, multimedia, and scientific computing applications. This paper presents a high-speed method for computing inverse square roots. This method uses a table lookup, operand modification, and multiplication to obtain an initial approximation to the inverse square root. This is followed by a modified Newton-Raphson iteration, consisting of one square, one multiply-complement, and one multiply-add operation. The initial approximation and Newton-Raphson iteration employ specialized hardware to reduce the delay, area, and power dissipation. Application of this method is illustrated through the design of an inverse square root unit operands in the IEEE single precision format. An implementation of this unit with a 4-layer metal, 2.5 Volt, 0.25 micron CMOS standard cell library has a cycle rime of 6.7 ns, an area of 0.41 mm/sup 2/, a latency of five cycles, and a throughput of one result per cycle.