{"title":"Digit Recurrence Floating-Point Division under HUB Format","authors":"J. Villalba","doi":"10.1109/ARITH.2016.17","DOIUrl":"https://doi.org/10.1109/ARITH.2016.17","url":null,"abstract":"Half-Unit-Biased format is based on shifting the representation line of the binary numbers by half a Unit in the Last Place. The main feature of this format is that round-to-nearest is carried out by a simple truncation, preventing any carry propagation and saving time and area. Algorithms and architectures have been defined for addition/subtraction and multiplication operations under this format. Nevertheless, the division operation has not been addressed yet. In this paper we deal with the floating-point division under HUB format, studying the architecture for the digit recurrence method, including the on-the-fly conversion of the signed digit quotient. Keywords: division by digit recurrence, HUB format, on-the-fly conversion","PeriodicalId":6526,"journal":{"name":"2015 IEEE 22nd Symposium on Computer Arithmetic","volume":"1 1","pages":"79-86"},"PeriodicalIF":0.0,"publicationDate":"2016-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79860050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Contributions to the Design of Residue Number System Architectures","authors":"Benoît Gérard, J. Kammerer, Nabil Merkiche","doi":"10.1109/ARITH.2015.25","DOIUrl":"https://doi.org/10.1109/ARITH.2015.25","url":null,"abstract":"Residue Number System (RNS) is nowadays considered as a real alternative to other hardware architectures for handling large-number computations. In this paper we propose algorithmic answers to some of the questions that may face a designer when implementing such a solution. More precisely, we investigate the following three problems. First, we propose an efficient method for constructing maximal bases, noticing that this problem can be seen as a max-clique problem. Second, we consider the logic gate count reduction when two different bases share the same hardware modules. Again it is linked to graph theory since it corresponds to finding a maximum weighted matching. Finally, we detail how the presence of DSP blocks in FPGAs can be leveraged to reach higher design frequencies by implementing full computation units inside.","PeriodicalId":6526,"journal":{"name":"2015 IEEE 22nd Symposium on Computer Arithmetic","volume":"21 1","pages":"105-112"},"PeriodicalIF":0.0,"publicationDate":"2015-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73443497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Calculating in floating sexagesimal place value notation, 4000 years ago","authors":"C. Proust","doi":"10.1109/ARITH.2015.33","DOIUrl":"https://doi.org/10.1109/ARITH.2015.33","url":null,"abstract":"Summary form only given, as follows. The full paper was not made available as part of this conference proceedings. By the end of the third millennium BCE in Mesopotamia an innovation of major significance for the history of mathematics occurred: the sexagesimal place value notation. A sophisticated mathematical culture was subsequently developed by masters attached to the scribal schools that flourished in Iraq, Iran and Syria during the first centuries of the second millennium BCE. The best known aspect of this mathematical culture is the art of solving quadratic problems. The numerical algorithms exploiting the properties of base 60 and the floating notation are less known. This paper presents some of these algorithms, especially those based on factorization methods.","PeriodicalId":6526,"journal":{"name":"2015 IEEE 22nd Symposium on Computer Arithmetic","volume":"187 1","pages":"1"},"PeriodicalIF":0.0,"publicationDate":"2015-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77264004","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"New Bit-Level Serial GF (2^m) Multiplication Using Polynomial Basis","authors":"H. El-Razouk, A. Reyhani-Masoleh","doi":"10.1109/ARITH.2015.11","DOIUrl":"https://doi.org/10.1109/ARITH.2015.11","url":null,"abstract":"The Polynomial basis (PB) representation offers efficient hardware realizations of GF(2^m) multipliers. Bit-level serial multiplication over GF(2^m) trades off the computational latency for lower silicon area, and hence, is favored in resource constrained applications. In such area critical applications, extra clock cycles might take place to read the inputs of the multiplication if the data-path has limited capacity. In this paper, we present a new bit-level serial PB multiplication scheme which generates its output bits in parallel after m clock cycles without requiring any preloading of the inputs, for the first time in the open literature. The proposed architecture, referred to as fully-serial-in-parallel-out (FSIPO), is useful for achieving higher throughput in resource constrained environments if the data-path for entering inputs has limited capacity, especially, for large dimensions of the field GF(2^m).","PeriodicalId":6526,"journal":{"name":"2015 IEEE 22nd Symposium on Computer Arithmetic","volume":"1 1","pages":"129-136"},"PeriodicalIF":0.0,"publicationDate":"2015-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89585444","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Efficient Softcore Multiplier Architecture for Xilinx FPGAs","authors":"M. Kumm, Shahid Abbas, P. Zipf","doi":"10.1109/ARITH.2015.17","DOIUrl":"https://doi.org/10.1109/ARITH.2015.17","url":null,"abstract":"This work presents an efficient implementation of a softcore multiplier, i.e., a multiplier architecture which can be efficiently mapped to the slice resources of modern Xilinx FPGAs. Instead of dividing the multiplication into the generation of partial products and the summation using a compressor tree, as done in modern multipliers, an array-like architecture is proposed. Each row of the array generates a partial product which is directly added to results of previous rows using the fast carry chain. A radix-4 Booth encoding/decoding is used to reduce the I/O count of the partial product generation, which makes it possible to map both the Booth encoder and decoder into a single 6-input look up table (LUT). Like a conventional Booth multiplier, this nearly halves the number of rows compared to a ripple carry array multiplier. In addition, the compressor tree is completely avoided and an efficient and regular structure remains that uses up to 50% fewer slice resources compared to previous approaches and offers a multiply accumulate (MAC) operation without extra resources.","PeriodicalId":6526,"journal":{"name":"2015 IEEE 22nd Symposium on Computer Arithmetic","volume":"17 1","pages":"18-25"},"PeriodicalIF":0.0,"publicationDate":"2015-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79272002","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"RNS Arithmetic Approach in Lattice-Based Cryptography: Accelerating the \"Rounding-off\" Core Procedure","authors":"J. Bajard, J. Eynard, Nabil Merkiche, T. Plantard","doi":"10.1109/ARITH.2015.30","DOIUrl":"https://doi.org/10.1109/ARITH.2015.30","url":null,"abstract":"Residue Number Systems (RNS) are naturally considered as an interesting candidate to provide efficient arithmetic for implementations of cryptosystems such as RSA, ECC (Elliptic Curve Cryptography), pairings, etc. More recently, RNS have been used to accelerate fully homomorphic encryption as lattice-based cryptography. In this paper, we present an RNS algorithm resolving the Closest Vector Problem (CVP). This algorithm is particularly efficient for a certain class of lattice basis. It provides a full RNS Babai round-off procedure without any costly conversion into alternative positional number system such as Mixed Radix System (MRS). An optimized Cox-Rower architecture adapted to the proposed algorithm is also presented. The main modifications reside in the Rower unit whose feature is to use only one multiplier. This frees two out of three multipliers from the Rower unit by reusing the same one with an overhead of 3 more cycles per inner reduction. An analysis of feasibility of implementation within FPGA is also given.","PeriodicalId":6526,"journal":{"name":"2015 IEEE 22nd Symposium on Computer Arithmetic","volume":"11 1","pages":"113-120"},"PeriodicalIF":0.0,"publicationDate":"2015-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84294698","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Precise and Fast Computation of Elliptic Integrals and Functions","authors":"T. Fukushima","doi":"10.1109/ARITH.2015.15","DOIUrl":"https://doi.org/10.1109/ARITH.2015.15","url":null,"abstract":"Summarized is the recent progress of the new methods to compute Legendre's complete and incomplete elliptic integrals of all three kinds and Jacobian elliptic functions. Also reviewed are the entirely new methods to (i) compute the inverse functions of complete elliptic integrals, (ii) invert a general incomplete elliptic integral numerically, and (iii) evaluate the partial derivatives of the elliptic integrals and functions recursively. In order to avoid the information loss against small parameter and/or characteristic, newly introduced are the associate complete and incomplete elliptic integrals. The main techniques used are (i) the piecewise approximation for single variable functions, and (ii) a systematic utilization of the half and double argument transformations and the truncated Maclaurin series expansions for the others. The new methods are of the errors of 5 ulps at most without any chance of cancellation against small input arguments. They run significantly faster than the existing methods: (i) slightly faster than Bulirsch's procedure for the incomplete elliptic integral of the first kind, (ii) 1.5 times faster than Bulirsch's procedure for Jacobian elliptic functions, (iii) 2.5 times faster than Cody's and Bulirsch's procedures for the complete elliptic integrals, and (iv) 3.5 times faster than Carlson's procedures for the incomplete elliptic integrals of the second and third kind. Their Fortran programs are available at https://www.researchgate.net/profile/Toshio_Fukushima/.","PeriodicalId":6526,"journal":{"name":"2015 IEEE 22nd Symposium on Computer Arithmetic","volume":"19 1","pages":"50-57"},"PeriodicalIF":0.0,"publicationDate":"2015-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74016051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Efficient Modular Exponentiation Based on Multiple Multiplications by a Common Operand","authors":"C. Nègre, T. Plantard, J. Robert","doi":"10.1109/ARITH.2015.24","DOIUrl":"https://doi.org/10.1109/ARITH.2015.24","url":null,"abstract":"The main operation in RSA encryption/decryption is the modular exponentiation, which involves a long sequence of modular squarings and multiplications. In this paper, we propose to improve modular multiplications AB, AC which have a common operand. To reach this goal we modify the Montgomery modular multiplication in order to share common computations in AB and AC. We extend this idea to reduce the cost of multiple modular multiplications AB1,...,ABℓ by the same operand A. We then take advantage of these improvements in the Montgomery-ladder and SPA resistant m-ary exponentiation algorithms. The complexity analysis shows that for an RSA modulus of size 2048 bits, the proposed improvements reduce the number of word operations (ADD and MUL) by 14% for the Montgomery-ladder and by 5%-8% for the m-ary exponentiations. Our implementations show a speed-up by 8%-14% for the Montgomery-ladder and by 1%-8% for the m-ary exponentiations for modulus of size 1024, 2048 and 4048 bits.","PeriodicalId":6526,"journal":{"name":"2015 IEEE 22nd Symposium on Computer Arithmetic","volume":"8 1","pages":"144-151"},"PeriodicalIF":0.0,"publicationDate":"2015-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80710241","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"An Automatable Formal Semantics for IEEE-754 Floating-Point Arithmetic","authors":"M. Brain, C. Tinelli, Philipp Rümmer, T. Wahl","doi":"10.1109/ARITH.2015.26","DOIUrl":"https://doi.org/10.1109/ARITH.2015.26","url":null,"abstract":"Automated reasoning tools often provide little or no support to reason accurately and efficiently about floating-point arithmetic. As a consequence, software verification systems that use these tools are unable to reason reliably about programs containing floating-point calculations or may give unsound results. These deficiencies are in stark contrast to the increasing awareness that the improper use of floating-point arithmetic in programs can lead to unintuitive and harmful defects in software. To promote coordinated efforts towards building efficient and accurate floating-point reasoning engines, this paper presents a formalization of the IEEE-754 standard for floating-point arithmetic as a theory in many-sorted first-order logic. Benefits include a standardized syntax and unambiguous semantics, allowing tool interoperability and sharing of benchmarks, and providing a basis for automated, formal analysis of programs that process floating-point data.","PeriodicalId":6526,"journal":{"name":"2015 IEEE 22nd Symposium on Computer Arithmetic","volume":"67 1","pages":"160-167"},"PeriodicalIF":0.0,"publicationDate":"2015-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91365934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}