Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336): Latest Articles

A reverse converter for the 4-moduli superset {2^n - 1, 2^n, 2^n + 1, 2^(n+1) + 1}
Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336) Pub Date : 1999-04-14 DOI: 10.1109/ARITH.1999.762842
M. Bhardwaj, T. Srikanthan, C. Clarke
Abstract: The authors propose an extension to the popular {2^n - 1, 2^n, 2^n + 1} moduli set by adding a fourth modulus, 2^(n+1) + 1. This extension leads to higher parallelism while keeping the forward conversion and modular arithmetic units simple. The main challenge of efficient reverse conversion is met by three techniques described for the first time. Firstly, we reverse convert linear combinations of moduli, hence reducing the number of non-zero bits in the Booth-encoded multiplicands from n to merely 2. Secondly, it is shown that division by 3, if introduced at the right stage, can be implemented very efficiently and can, in turn, reduce the cost of the converter. To implement VLSI-efficient modulo reduction, we propose two techniques: multiple split tables (MST) and a modified division algorithm (MDA). It is shown that the MST can reduce exponential ROM requirements to quadratic ROM requirements while the MDA can reduce these further to linear requirements. As a result of these innovations, the proposed reverse converter uses simple shift and add operations and needs a lookup with only 6 entries. The delay of the converter is approximately 10n+13 full adder delays and the area cost is quadratic in n.
Citations: 47
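Illustrative note (not from the paper): the abstract above concerns forward and reverse conversion in a residue number system. The sketch below shows only the textbook version of that idea, using the Chinese Remainder Theorem for the reverse step. It is not the authors' shift-and-add converter, and the function names (`moduli_set`, `to_residues`, `from_residues`) are hypothetical. Plain CRT needs pairwise coprime moduli, so the sketch checks for that; n = 3 gives the moduli 7, 8, 9, 17.

```python
from math import gcd, prod


def moduli_set(n):
    """Return the four moduli 2^n - 1, 2^n, 2^n + 1, 2^(n+1) + 1."""
    return [2**n - 1, 2**n, 2**n + 1, 2**(n + 1) + 1]


def to_residues(x, moduli):
    """Forward conversion: one residue per modulus."""
    return [x % m for m in moduli]


def from_residues(residues, moduli):
    """Reverse conversion by the textbook Chinese Remainder Theorem."""
    # Plain CRT is only valid when the moduli are pairwise coprime.
    for i, a in enumerate(moduli):
        for b in moduli[i + 1:]:
            if gcd(a, b) != 1:
                raise ValueError(f"moduli {a} and {b} are not coprime")
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        m_i = big_m // m
        # pow(m_i, -1, m) is the modular inverse of m_i modulo m (Python 3.8+).
        total += r * m_i * pow(m_i, -1, m)
    return total % big_m


moduli = moduli_set(3)
residues = to_residues(1234, moduli)
recovered = from_residues(residues, moduli)
```

The round trip recovers any integer in the range 0 to 7·8·9·17 − 1 = 8567; a hardware converter would replace the generic multiplications and the final reduction with the specialised structures the abstract describes.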
Correctness proofs outline for Newton-Raphson based floating-point divide and square root algorithms
Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336) Pub Date : 1999-04-14 DOI: 10.1109/ARITH.1999.762834
Marius A. Cornea-Hasegan, Roger A. Golliver, Peter W. Markstein
Abstract: This paper describes a study of a class of algorithms for the floating-point divide and square root operations, based on the Newton-Raphson iterative method. The two main goals were: (1) Proving the IEEE correctness of these iterative floating-point algorithms, i.e. compliance with the IEEE-754 standard for binary floating-point operations. The focus was on software driven iterative algorithms, instead of the hardware based implementations that dominated until now. (2) Identifying the special cases of operands that require Software Assistance due to possible overflow, or loss of precision of intermediate results. This study was initiated in an attempt to prove the IEEE correctness for a class of divide and square root algorithms based on the Newton-Raphson iterative methods. As more insight into the inner workings of these algorithms was gained, it became obvious that a formal study and proof were necessary in order to achieve the desired objectives. The result is a complete and rigorous proof of IEEE correctness for floating-point divide and square root algorithms based on the Newton-Raphson iterative method. Even more, the method used in proving the IEEE correctness of the square root algorithm is applicable in principle to any iterative algorithm, not only those based on the Newton-Raphson method. Conditions requiring Software Assistance (SWA) were also determined, and were used to identify cases when alternate algorithms are needed to generate correct results. Overall, this is one important step toward flawless implementation of these floating-point operations based on software implementations.
Citations: 62
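Illustrative note (not from the paper): the Newton-Raphson iteration the abstract refers to can be sketched for division as a reciprocal refinement, y ← y·(2 − b·y), followed by a multiplication. The names `nr_reciprocal` and `nr_divide` and the starting guess are assumptions. The sketch runs in ordinary double precision and makes no claim of IEEE-correct rounding, which is exactly the property the paper sets out to prove for specific algorithms.

```python
def nr_reciprocal(b, y0, iterations=5):
    """Refine an initial guess y0 of 1/b with Newton-Raphson steps."""
    y = y0
    for _ in range(iterations):
        # If y = (1 - e) / b, the next iterate is (1 - e*e) / b,
        # so the relative error is squared on every step.
        y = y * (2.0 - b * y)
    return y


def nr_divide(a, b, y0, iterations=5):
    """Approximate a / b as a * (1 / b)."""
    return a * nr_reciprocal(b, y0, iterations)


quotient = nr_divide(1.0, 3.0, y0=0.3)
```

The iteration converges only when the initial guess satisfies |1 − b·y0| < 1; practical implementations typically obtain y0 from a small lookup table or a reciprocal-estimate instruction.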
On infinitely precise rounding for division, square root, reciprocal and square root reciprocal
Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336) Pub Date : 1999-04-14 DOI: 10.1109/ARITH.1999.762849
Cristina Iordache, D. Matula
Abstract: Quotients, reciprocals, square roots and square root reciprocals all have the property that infinitely precise p-bit rounded results for p-bit input operands can be obtained from approximate results of bounded accuracy. We investigate lower bounds on the number of bits of an approximation accurate to a unit in the last place sufficient to guarantee that correct round and sticky bits can be determined. Known lower bounds for quotients and square roots are given and/or sharpened, and a new lower bound for root reciprocals is proved. Specifically for reciprocals, quotients and square roots, tight bounds of order 2p+O(1) are presented. For infinitely precise rounding of the root reciprocal, a lower bound can be found at 3p+O(1), but exhaustive testing for small sizes of the operand suggests that in practice (2+ε)p for small ε is usually sufficient. Algorithms can be designed for obtaining the round and sticky bits based on the bit pattern of an approximation computed to the required accuracy. We show that some improvement of the known lower bound for reciprocals and division is achievable at the cost of somewhat more complex hardware for rounding. Tests for the exactness of the quotient and square root are also provided.
Citations: 41
Series approximation methods for divide and square root in the Power3™ processor
Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336) Pub Date : 1999-04-14 DOI: 10.1109/ARITH.1999.762836
M. Schmookler, R. Agarwal, F. Gustavson
Abstract: The Power3 processor is a 64-bit implementation of the PowerPC™ architecture and is the successor to the Power2™ processor for workstations and servers which require high performance floating point capability. The previous processors used Newton-Raphson algorithms for their implementations of divide and square root. The Power3 processor has a longer pipeline latency, which would substantially increase the latency for these instructions. Instead, new algorithms based on power series approximations were developed which provide significantly better performance than the Newton-Raphson algorithm for this processor. This paper describes the algorithms, and then shows how both the series based algorithms and the Newton-Raphson algorithms are affected by pipeline length. For the Power3, the power series algorithms reduce the divide latency by over 20% and the square root latency by 35%.
Citations: 29
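Illustrative note (not from the paper): one generic way to divide with a power series, given an approximate reciprocal y0 of b, is to set e = 1 − b·y0 and use 1/b = y0·(1 + e + e² + …). The sketch below evaluates a truncated series with Horner's rule. It is an assumption-laden illustration of the general idea only, not the Power3 algorithm, and `series_divide` is a hypothetical name.

```python
def series_divide(a, b, y0, terms=8):
    """Approximate a / b from a rough reciprocal y0 of b.

    With e = 1 - b*y0, 1/b equals y0 * (1 + e + e^2 + ...), which
    converges when |e| < 1. Truncating after `terms` terms leaves a
    relative error of about e**terms.
    """
    e = 1.0 - b * y0
    s = 1.0
    for _ in range(terms - 1):
        s = 1.0 + e * s  # Horner step: adds one more power of e
    return a * y0 * s


quotient = series_divide(1.0, 3.0, y0=0.3, terms=16)
```

Unlike the Newton-Raphson iteration, where each step depends on the previous one, the powers of e here can in principle be formed with less serial dependence, which is one reason series methods can suit deeper pipelines.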
Digit-recurrence algorithm for computing Euclidean norm of a 3-D vector
Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336) Pub Date : 1999-04-14 DOI: 10.1109/ARITH.1999.762833
N. Takagi, S. Kuwahara
Abstract: A digit-recurrence algorithm for computing the Euclidean norm of a 3-dimensional vector is proposed. Starting from the vector component with the highest order of magnitude as the initial value of the partial result, correcting-digits produced by the recurrence are added to it step by step. Partial products of the squares of the other two components are added to the residual, step by step. The additions/subtractions in the recurrence are performed without carry/borrow propagation by the use of a redundant representation of the residual. An extension of the on-the-fly conversion algorithm is used for updating the partial result. Different specific versions of the algorithm are possible, depending on the radix, the redundancy factor of the correcting-digit set, the type of representation of the residual, and the correcting-digit selection function.
Citations: 4
Interval sine and cosine functions computation based on variable-precision CORDIC algorithm
Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336) Pub Date : 1999-04-14 DOI: 10.1109/ARITH.1999.762844
J. Hormigo, J. Villalba, E. Zapata
Abstract: In this paper we design a variable-precision CORDIC architecture and propose a new algorithm to compute the interval sine and cosine functions. This system allows us to specify the precision used to compute the sine and cosine functions, and to control the accuracy of the result, in such a way that recomputation of inaccurate results can be carried out with higher precision. An important reduction in the number of iterations is obtained by taking advantage of the differential angle, and the number of cycles per iteration is reduced by avoiding the additions of the leading all-zero words. As a consequence, the computation time of the interval function evaluation obtained is close to that of a point function evaluation. The problem of the large table of angles and the scale factor compensation involved in a high precision CORDIC has been solved efficiently.
Citations: 9
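Illustrative note (not from the paper): the basic rotation-mode CORDIC iteration that such architectures build on can be sketched as follows. This is the classical fixed-iteration algorithm in double precision, evaluated at a single point; it does not implement the variable-precision or interval features described in the abstract. The name `cordic_sin_cos` and the default iteration count are assumptions.

```python
import math


def cordic_sin_cos(theta, iterations=40):
    """Return (sin(theta), cos(theta)) for theta in [-pi/2, pi/2]."""
    # Table of elementary rotation angles atan(2^-i).
    angles = [math.atan(2.0 ** -i) for i in range(iterations)]
    # Each micro-rotation stretches the vector by sqrt(1 + 2^(-2i));
    # start from 1/gain so the final vector has unit length.
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    x, y, z = 1.0 / gain, 0.0, theta
    for i in range(iterations):
        d = 1.0 if z >= 0.0 else -1.0  # rotate toward residual angle 0
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
        z -= d * angles[i]
    return y, x


s, c = cordic_sin_cos(0.5)
```

In hardware the multiplications by 2^-i are shifts, so each iteration costs only additions and a table lookup; the angle table and the scale-factor compensation are the two costs the abstract says grow with precision.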
Number-theoretic test generation for directed rounding
Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336) Pub Date : 1999-04-14 DOI: 10.1109/ARITH.1999.762850
Michael Parks
Abstract: We present methods to generate systematically the hardest test cases for multiplication, division, and square root subject to directed rounding, essentially extending previous work on number-theoretic floating point testing to rounding modes other than to-nearest. The algorithms focus upon the rounding boundaries of the modes truncate, to-minus-infinity, and to-infinity, and programs based on them require little beyond exact arithmetic in the working precision to create billions of edge cases. We show that the amount of work required to calculate trial multiplicands pays off in the form of free extra tests due to an interconnection among the operations considered herein. Although these tests do not replace proofs of correctness, they can be used to gain a high degree of confidence that the accuracy requirements as mandated by IEEE Standard 754 have been satisfied.
Citations: 37
High-speed inverse square roots
Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336) Pub Date : 1999-04-14 DOI: 10.1109/ARITH.1999.762837
M. Schulte, K. Wires
Abstract: Inverse square roots are used in several digital signal processing, multimedia, and scientific computing applications. This paper presents a high-speed method for computing inverse square roots. This method uses a table lookup, operand modification, and multiplication to obtain an initial approximation to the inverse square root. This is followed by a modified Newton-Raphson iteration, consisting of one square, one multiply-complement, and one multiply-add operation. The initial approximation and Newton-Raphson iteration employ specialized hardware to reduce the delay, area, and power dissipation. Application of this method is illustrated through the design of an inverse square root unit for operands in the IEEE single precision format. An implementation of this unit with a 4-layer metal, 2.5 Volt, 0.25 micron CMOS standard cell library has a cycle time of 6.7 ns, an area of 0.41 mm², a latency of five cycles, and a throughput of one result per cycle.
Citations: 28
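Illustrative note (not from the paper): the standard Newton-Raphson iteration for the inverse square root is y ← y·(3 − x·y²)/2. The sketch below shows that unmodified textbook form in double precision. The paper's modified iteration, table lookup, and operand modification are not reproduced, and `nr_rsqrt` and its starting guess are hypothetical.

```python
def nr_rsqrt(x, y0, iterations=5):
    """Refine an initial guess y0 of 1/sqrt(x) with Newton-Raphson steps."""
    y = y0
    for _ in range(iterations):
        # One square (y*y), one multiply with complement (3 - x*y*y),
        # and one final multiply and halving per step.
        y = 0.5 * y * (3.0 - x * y * y)
    return y


approx = nr_rsqrt(2.0, y0=0.7)
```

Convergence is quadratic once the guess is close, so a more accurate initial approximation directly reduces the number of iterations needed, which is why the abstract puts effort into the lookup stage.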
Moduli for testing implementations of the RSA cryptosystem
Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336) Pub Date : 1999-04-14 DOI: 10.1109/ARITH.1999.762832
C. D. Walter
Abstract: Comprehensive testing of any implementation of the RSA cryptosystem requires the use of a number of moduli with specific properties. It is shown how to generate a sufficient variety of these to enable testing which will justify high confidence in the correctness of both the design and the operation of hardware implementations. The tests avoid the necessity of another implementation for comparison. Many of these moduli are also suitable for testing software implementations. Furthermore, the methods apply equally well to other similar modular arithmetic based cryptosystems which use exponentiation, such as Diffie-Hellman key exchange.
Citations: 4
A low-power, high-speed implementation of a PowerPC™ microprocessor vector extension
Proceedings 14th IEEE Symposium on Computer Arithmetic (Cat. No.99CB36336) Pub Date : 1999-04-14 DOI: 10.1109/ARITH.1999.762823
M. Schmookler, M. Putrino, Anh Mather, J. Tyler, Huy Nguyen, C. Roth, Mukesh Sharma, M. Pham, Jeff Lent
Abstract: The AltiVec™ technology is an extension to the PowerPC architecture™ which provides new computational and storage operations for handling vectors of various data lengths and data types. The first implementation using this technology is a low-cost, low-power processor based on the acclaimed PowerPC 750™ microprocessor. This paper describes the microarchitecture and design of the vector arithmetic unit of this implementation.
Citations: 44