{"title":"An underflow-induced graphics failure solved by SLI arithmetic","authors":"D. Lozier","doi":"10.1109/ARITH.1993.378114","DOIUrl":"https://doi.org/10.1109/ARITH.1993.378114","url":null,"abstract":"Floating-point underflow is often regarded as either harmless or as an indication that the computational algorithm is in need of scaling. A counterexample to this view is given of a function for which contour plotting is difficult due to floating-point underflow. The function arose as an asymptotic solution to a model problem in turbulent combustion in which two chemical species (fuel and oxidizer) mix and react in a vortex field. Scaling is not a viable option because of extreme sensitivity to a small physical parameter. Standard graphics software packages produce erroneous contours without any indication of difficulty. This example provides support for considering symmetric level-index arithmetic, a new form of computer arithmetic which is immune to underflow and overflow.","PeriodicalId":414758,"journal":{"name":"Proceedings of IEEE 11th Symposium on Computer Arithmetic","volume":"24 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115712801","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Very high radix division with selection by rounding and prescaling","authors":"M. Ercegovac, T. Lang, P. Montuschi","doi":"10.1109/ARITH.1993.378102","DOIUrl":"https://doi.org/10.1109/ARITH.1993.378102","url":null,"abstract":"A division algorithm in which the quotient-digit selection is performed by rounding the shifted residual in carry-save form is presented. To allow the use of this simple function, the divisor (and dividend) is prescaled to a range close to one. The implementation presented results in a fast iteration because of the use of carry-save forms and suitable recodings. The execution time is calculated, and several convenient values of the radix are selected. Comparison with other high-radix dividers is performed using the same assumptions.","PeriodicalId":414758,"journal":{"name":"Proceedings of IEEE 11th Symposium on Computer Arithmetic","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116904560","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On squaring and multiplying large integers","authors":"D. Zuras","doi":"10.1109/ARITH.1993.378084","DOIUrl":"https://doi.org/10.1109/ARITH.1993.378084","url":null,"abstract":"Methods of squaring large integers are discussed. The obvious O(n^2) method turns out to be best for small numbers. The existing ≈ O(n^1.585) method becomes better as the numbers get bigger. New methods that are ≈ O(n^1.465) and ≈ O(n^1.404) are presented. All of these methods can be generalized to multiplication and turn out to be faster than a fast Fourier transform (FFT) multiplication for numbers that can be quite large (>3,000,000 b). Squaring seems to be fundamentally faster than multiplication, but it is shown that T_mult ≤ 2T_sq + O(n).","PeriodicalId":414758,"journal":{"name":"Proceedings of IEEE 11th Symposium on Computer Arithmetic","volume":"87 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123590807","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast evaluation of polynomials and inverses of polynomials","authors":"X. Merrheim, J. Muller, Hong-Jin Yeh","doi":"10.1109/ARITH.1993.378093","DOIUrl":"https://doi.org/10.1109/ARITH.1993.378093","url":null,"abstract":"The parallel and online (i.e., digit serial, most significant digit first) evaluation of polynomials and inverses of polynomials is dealt with. New algorithms and architectures are proposed for such evaluations. A 3-D implementation model is presented.","PeriodicalId":414758,"journal":{"name":"Proceedings of IEEE 11th Symposium on Computer Arithmetic","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134064770","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Floating point Cordic","authors":"G. Hekstra, E. Deprettere","doi":"10.1109/ARITH.1993.378100","DOIUrl":"https://doi.org/10.1109/ARITH.1993.378100","url":null,"abstract":"A full-precision floating-point Cordic algorithm, suitable for the implementation of a word-serial Cordic architecture, is presented. The extension to existing block floating-point Cordic algorithms is in a floating-point representation for the angle. The angle is represented as a combination of exponent, microrotation bits, and two bits to indicate prerotations over π/2 and π radians. Representing floating-point angles in this fashion maintains the accuracy that is present in the input data, which makes it ideally suited for implementing a floating-point Givens operator.","PeriodicalId":414758,"journal":{"name":"Proceedings of IEEE 11th Symposium on Computer Arithmetic","volume":"293 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132097466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Faster numerical algorithms via exception handling","authors":"J. Demmel, X. Li","doi":"10.1109/ARITH.1993.378087","DOIUrl":"https://doi.org/10.1109/ARITH.1993.378087","url":null,"abstract":"An attractive paradigm for building fast numerical algorithms is the following: (1) try a fast but occasionally unstable algorithm, (2) test the accuracy of the computed answer, and (3) recompute the answer slowly and accurately in the unlikely event it is necessary. This is especially attractive on parallel machines where the fastest algorithms may be less stable than the best serial algorithms. Since unstable algorithms can overflow or cause other exceptions, exception handling is needed to implement this paradigm safely. To implement it efficiently, exception handling cannot be too slow. This paradigm is illustrated with numerical linear algebra algorithms from the LAPACK library.","PeriodicalId":414758,"journal":{"name":"Proceedings of IEEE 11th Symposium on Computer Arithmetic","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127967565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Hardware starting approximation for the square root operation","authors":"E. Schwarz, M. Flynn","doi":"10.1109/ARITH.1993.378103","DOIUrl":"https://doi.org/10.1109/ARITH.1993.378103","url":null,"abstract":"A method for obtaining high-precision approximations of high-order arithmetic operations is presented. These approximations provide an accurate starting approximation for high-precision iterative algorithms, which translates into few iterations and a short overall latency. The method uses a partial product array to describe an approximation and sums the array on an existing multiplier. By reusing a multiplier the amount of dedicated hardware is made very small. For the square-root operation, a 16-bit approximation costs less than 1000 dedicated logic gates to implement and has the latency of approximately one multiplication. This is 1/500 the size of an equivalent look-up table method and over twice as many bits of accuracy as an equivalent polynomial method. Thus, a high-precision approximation of the square root operation and many other high-order arithmetic operations is possible at low cost.","PeriodicalId":414758,"journal":{"name":"Proceedings of IEEE 11th Symposium on Computer Arithmetic","volume":"45 7","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114039218","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"High-radix modular multiplication for cryptosystems","authors":"Peter Kornerup","doi":"10.1109/ARITH.1993.378082","DOIUrl":"https://doi.org/10.1109/ARITH.1993.378082","url":null,"abstract":"Two algorithms for modular multiplication with very large moduli are analyzed specifically for their applicability when a high radix is used for the multiplier. Both algorithms perform modulo reductions interleaved with the addition of partial products; one algorithm uses the standard residue system, whereas the other utilizes a nonstandard system using reductions modulo a power of the base. The emphasis is on situations, as in cryptosystems, where modular exponentiation is to be realized by many repeated modular multiplications on very large operands, e.g., for cryptosystems with key lengths of 500-1000 b.","PeriodicalId":414758,"journal":{"name":"Proceedings of IEEE 11th Symposium on Computer Arithmetic","volume":"25 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"117144776","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Design of a fast validated dot product operation","authors":"M. Daumas, D. Matula","doi":"10.1109/ARITH.1993.378108","DOIUrl":"https://doi.org/10.1109/ARITH.1993.378108","url":null,"abstract":"A double precision dot product operation is designed in which the final rounded result is validated by raising exception flags if either the result incurs catastrophic cancellation or the result is not accurate to one unit in the last place (ulp). The design guarantees one ulp accuracy in the absence of catastrophic cancellation. The user can thus obtain validated results at marginal extra cost with the ability to trap to alternative routines in those cases where the results are suspicious.","PeriodicalId":414758,"journal":{"name":"Proceedings of IEEE 11th Symposium on Computer Arithmetic","volume":"144 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115165600","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Division with speculation of quotient digits","authors":"J. Cortadella, T. Lang","doi":"10.1109/ARITH.1993.378105","DOIUrl":"https://doi.org/10.1109/ARITH.1993.378105","url":null,"abstract":"The speed of SRT-type dividers is mainly determined by the complexity of the quotient-digit selection, so that implementations are limited to low-radix stages. A scheme is presented in which the quotient-digit is speculated and, when this speculation is incorrect, a rollback or a partial advance is performed. This results in a division operation with a shorter cycle time and a variable number of cycles. Several designs have been realized, and a radix-64 implementation that is 30% faster than the fastest conventional implementation (radix-8) at an increase of about 45% in area per quotient bit has been obtained. A radix-16 implementation that is about 10% faster than the radix-8 conventional one, with the additional advantage of requiring about 25% less area per quotient bit, is also shown.","PeriodicalId":414758,"journal":{"name":"Proceedings of IEEE 11th Symposium on Computer Arithmetic","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"1993-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114927654","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}