2013 IEEE 21st Symposium on Computer Arithmetic最新文献

筛选
英文 中文
Relation Collection for the Function Field Sieve 函数字段筛选的关系集合
2013 IEEE 21st Symposium on Computer Arithmetic Pub Date : 2013-04-07 DOI: 10.1109/ARITH.2013.28
J. Detrey, P. Gaudry, M. Videau
{"title":"Relation Collection for the Function Field Sieve","authors":"J. Detrey, P. Gaudry, M. Videau","doi":"10.1109/ARITH.2013.28","DOIUrl":"https://doi.org/10.1109/ARITH.2013.28","url":null,"abstract":"In this paper, we focus on the relation collection step of the Function Field Sieve (FFS), which is to date the best algorithm known for computing discrete logarithms in small-characteristic finite fields of cryptographic sizes. Denoting such a finite field by Fpn, where p is much smaller than n, the main idea behind this step is to find polynomials of the form a(t)-b(t)x in Fp[t][x] which, when considered as principal ideals in carefully selected function fields, can be factored into products of low-degree prime ideals. Such polynomials are called \"relations\", and current record-sized discrete-logarithm computations need billions of those. Collecting relations is therefore a crucial and extremely expensive step in FFS, and a practical implementation thereof requires heavy use of cache-aware sieving algorithms, along with efficient polynomial arithmetic over Fp[t]. This paper presents the algorithmic and arithmetic techniques which were put together as part of a new public implementation of FFS, aimed at medium-to record-sized computations.","PeriodicalId":211528,"journal":{"name":"2013 IEEE 21st Symposium on Computer Arithmetic","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126944324","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 18
SIPE: Small Integer Plus Exponent SIPE:小整数加指数
2013 IEEE 21st Symposium on Computer Arithmetic Pub Date : 2013-04-07 DOI: 10.1109/ARITH.2013.22
Vincent Lefèvre
{"title":"SIPE: Small Integer Plus Exponent","authors":"Vincent Lefèvre","doi":"10.1109/ARITH.2013.22","DOIUrl":"https://doi.org/10.1109/ARITH.2013.22","url":null,"abstract":"SIPE (Small Integer Plus Exponent) is a mini-library in the form of a C header file, to perform floating-point computations in very low precisions with correct rounding to nearest in radix 2. The goal of such a tool is to do proofs of algorithms/properties or computations of tight error bounds in these precisions by exhaustive tests, in order to try to generalize them to higher precisions. The currently supported operations are addition, subtraction, multiplication (possibly with the error term), FMA, and miscellaneous comparisons and conversions. Timing comparisons have been done with hardware IEEE-754 floating point and with GNU MPFR.","PeriodicalId":211528,"journal":{"name":"2013 IEEE 21st Symposium on Computer Arithmetic","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130293316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
Numerical Reproducibility and Accuracy at ExaScale 在ExaScale上的数值再现性和准确性
2013 IEEE 21st Symposium on Computer Arithmetic Pub Date : 2013-04-07 DOI: 10.1109/ARITH.2013.43
J. Demmel, Hong Diep Nguyen
{"title":"Numerical Reproducibility and Accuracy at ExaScale","authors":"J. Demmel, Hong Diep Nguyen","doi":"10.1109/ARITH.2013.43","DOIUrl":"https://doi.org/10.1109/ARITH.2013.43","url":null,"abstract":"Given current hardware trends, ExaScale computing (1018 floating point operations per second) is projected to be available in less than a decade, achieved by using a huge number of processors, of order 109. Given the likely hardware heterogeneity in both platform and network, and the possibility of intermittent failures, dynamic scheduling will be needed to adapt to changing resources and loads. This will make it likely that repeated runs of a program will not execute operations like reductions in exactly the same order. This in turn will make reproducibility, i.e. getting bitwise identical results from run to run, difficult to achieve, because floating point operations like addition are not associative, so computing sums in different orders often leads to different results. Indeed, this is already a challenge on today's platforms.","PeriodicalId":211528,"journal":{"name":"2013 IEEE 21st Symposium on Computer Arithmetic","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114767674","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 18
Improved Architectures for a Floating-Point Fused Dot Product Unit 浮点融合点积单元的改进架构
2013 IEEE 21st Symposium on Computer Arithmetic Pub Date : 2013-04-07 DOI: 10.1109/ARITH.2013.26
Jongwook Sohn, E. Swartzlander
{"title":"Improved Architectures for a Floating-Point Fused Dot Product Unit","authors":"Jongwook Sohn, E. Swartzlander","doi":"10.1109/ARITH.2013.26","DOIUrl":"https://doi.org/10.1109/ARITH.2013.26","url":null,"abstract":"This paper presents improved architectures for a floating-point fused two-term dot product unit. The floating-point fused dot product unit is useful for a wide variety of digital signal processing (DSP) applications including complex multiplication and fast Fourier transform (FFT) and discrete cosine transform (DCT) butterfly operations. In order to improve the performance, a new alignment scheme, early normalization, a four-input leading zero anticipation (LZA), a dual-path algorithm, and pipelining are applied. The proposed designs are implemented for single precision and synthesized with a 45nm standard cell library. The proposed dual-path design reduces the latency by 25% compared to the traditional floating-point fused dot product unit. Based on a data flow analysis, the proposed design can be split into three pipeline stages. Since the latencies of the three stages are fairly well balanced, the throughput is increased by a factor of 2.8 compared to the non-pipelined dual-path design.","PeriodicalId":211528,"journal":{"name":"2013 IEEE 21st Symposium on Computer Arithmetic","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132351398","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 42
Comparison between Binary64 and Decimal64 Floating-Point Numbers Binary64和Decimal64浮点数的比较
2013 IEEE 21st Symposium on Computer Arithmetic Pub Date : 2013-04-07 DOI: 10.1109/ARITH.2013.23
N. Brisebarre, M. Mezzarobba, J. Muller, C. Lauter
{"title":"Comparison between Binary64 and Decimal64 Floating-Point Numbers","authors":"N. Brisebarre, M. Mezzarobba, J. Muller, C. Lauter","doi":"10.1109/ARITH.2013.23","DOIUrl":"https://doi.org/10.1109/ARITH.2013.23","url":null,"abstract":"We introduce a software-oriented algorithm that allows one to quickly compare a binary64 floating-point (FP) number and a decimal64 FP number, assuming the \"binary encoding\" of the decimal formats specified by the IEEE 754-2008 standard for FP arithmetic is used. It is a two-step algorithm: a first pass, based on the exponents only, makes it possible to quickly eliminate most cases, then when the first pass does not suffice, a more accurate second pass is required. We provide an implementation of several variants of our algorithm, and compare them.","PeriodicalId":211528,"journal":{"name":"2013 IEEE 21st Symposium on Computer Arithmetic","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115341779","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 5
Fast Reproducible Floating-Point Summation 快速可重复浮点求和
2013 IEEE 21st Symposium on Computer Arithmetic Pub Date : 2013-04-07 DOI: 10.1109/ARITH.2013.9
J. Demmel, Hong Diep Nguyen
{"title":"Fast Reproducible Floating-Point Summation","authors":"J. Demmel, Hong Diep Nguyen","doi":"10.1109/ARITH.2013.9","DOIUrl":"https://doi.org/10.1109/ARITH.2013.9","url":null,"abstract":"Reproducibility, i.e. getting the bitwise identical floating point results from multiple runs of the same program, is a property that many users depend on either for debugging or correctness checking in many codes [1]. However, the combination of dynamic scheduling of parallel computing resources, and floating point nonassociativity, make attaining reproducibility a challenge even for simple reduction operations like computing the sum of a vector of numbers in parallel. We propose a technique for floating point summation that is reproducible independent of the order of summation. Our technique uses Rump's algorithm for error-free vector transformation [2], and is much more efficient than using (possibly very) high precision arithmetic. Our algorithm trades off efficiency and accuracy: we reproducibly attain reasonably accurate results (with an absolute error bound c · n2 · macheps · max |vi| for a small constant c) with just 2n + O(1) floating-point operations, and quite accurate results (with an absolute error bound c · n3 · macheps2 · max |vi| with 5n + O(1) floating point operations, both with just two reduction operations. Higher accuracies are also possible by increasing the number of error-free transformations. As long as the same rounding mode is used, results computed by the proposed algorithms are reproducible for any run on any platform.","PeriodicalId":211528,"journal":{"name":"2013 IEEE 21st Symposium on Computer Arithmetic","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124855960","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 76
On the Componentwise Accuracy of Complex Floating-Point Division with an FMA 基于FMA的复杂浮点除法分量精度研究
2013 IEEE 21st Symposium on Computer Arithmetic Pub Date : 2013-04-07 DOI: 10.1109/ARITH.2013.8
C. Jeannerod, N. Louvet, J. Muller
{"title":"On the Componentwise Accuracy of Complex Floating-Point Division with an FMA","authors":"C. Jeannerod, N. Louvet, J. Muller","doi":"10.1109/ARITH.2013.8","DOIUrl":"https://doi.org/10.1109/ARITH.2013.8","url":null,"abstract":"This paper deals with the accuracy of complex division in radix-two floating-point arithmetic. Assuming that a fused multiply-add (FMA) instruction is available and that no underflow/overflow occurs, we study how to ensure high relative accuracy in the component wise sense. Since this essentially reduces to evaluating accurately three expressions of the form ac+bd, an obvious approach would be to perform three calls to Kahan's compensated algorithm for 2 by 2 determinants. However, in the context of complex division, two of those expressions are such that ac and bd have the same sign, suggesting that cheaper schemes should be used here (since cancellation cannot occur). We first give a detailed accuracy analysis of such schemes for the sum of two nonnegative products, providing not only sharp bounds on both their absolute and relative errors, but also sufficient conditions for the output of one of them to coincide with the output of Kahan's algorithm. By combining Kahan's algorithm with this particular scheme, we then deduce two new division algorithms. Our first algorithm is a straight-line program whose component wise relative error is always at most 5u+13u2 with u the unit round off, we also provide examples of inputs for which the error of this algorithm approaches 5u, thus showing that our upper bound is essentially the best possible. When tests are allowed we show with a second algorithm that the bound above can be further reduced to 4.5u+9u2, and that this improved bound is reasonably sharp.","PeriodicalId":211528,"journal":{"name":"2013 IEEE 21st Symposium on Computer Arithmetic","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130589541","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 9
How to Compute the Area of a Triangle: A Formal Revisit 如何计算三角形的面积:一个正式的回顾
2013 IEEE 21st Symposium on Computer Arithmetic Pub Date : 2013-04-07 DOI: 10.1109/ARITH.2013.29
S. Boldo
{"title":"How to Compute the Area of a Triangle: A Formal Revisit","authors":"S. Boldo","doi":"10.1109/ARITH.2013.29","DOIUrl":"https://doi.org/10.1109/ARITH.2013.29","url":null,"abstract":"Mathematical values are usually computed using well-known mathematical formulas without thinking about their accuracy, which may turn awful with particular instances. This is the case for the computation of the area of a triangle. When the triangle is needle-like, the common formula has a very poor accuracy. Kahan proposed in 1986 an algorithm he claimed correct within a few ulps. Goldberg took over this algorithm in 1991 and gave a precise error bound. This article presents a formal proof of this algorithm, an improvement of its error bound and new investigations in case of underflow.","PeriodicalId":211528,"journal":{"name":"2013 IEEE 21st Symposium on Computer Arithmetic","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130149775","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 4
Truncated Logarithmic Approximation 截断对数近似
2013 IEEE 21st Symposium on Computer Arithmetic Pub Date : 2013-04-07 DOI: 10.1109/ARITH.2013.34
Michael B. Sullivan, E. Swartzlander
{"title":"Truncated Logarithmic Approximation","authors":"Michael B. Sullivan, E. Swartzlander","doi":"10.1109/ARITH.2013.34","DOIUrl":"https://doi.org/10.1109/ARITH.2013.34","url":null,"abstract":"The speed and levels of integration of modern devices have risen to the point that arithmetic can be performed very fast and with high precision. Precise arithmetic comes at a hidden cost-by computing results past the precision they require, systems inefficiently utilize their resources. Numerous designs over the past fifty years have demonstrated scalable efficiency by utilizing approximate logarithms. Many such designs are based off of a linear approximation algorithm developed by Mitchell. This paper evaluates a truncated form of binary logarithm as a replacement for Mitchell's algorithm. The truncated approximate logarithm simultaneously improves the efficiency and precision of Mitchell's approximation while remaining simple to implement.","PeriodicalId":211528,"journal":{"name":"2013 IEEE 21st Symposium on Computer Arithmetic","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115257861","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 18
Precision, Accuracy, and Rounding Error Propagation in Exascale Computing 百亿亿次计算中的精度、准确度和舍入误差传播
2013 IEEE 21st Symposium on Computer Arithmetic Pub Date : 2013-04-07 DOI: 10.1109/ARITH.2013.42
Marius Cornea
{"title":"Precision, Accuracy, and Rounding Error Propagation in Exascale Computing","authors":"Marius Cornea","doi":"10.1109/ARITH.2013.42","DOIUrl":"https://doi.org/10.1109/ARITH.2013.42","url":null,"abstract":"Exascale level computers might be available in less than a decade. Computer architects are already thinking of, and planning to achieve such levels of performance. It is reasonable to expect that researchers and engineers will carry out scientific and engineering computations more complex than ever before, and will attempt breakthroughs not possible today. If the size of the problems solved on such machines scales accordingly, we may face new issues related to precision, accuracy, performance, and programmability. The paper examines some relevant aspects of this problem.","PeriodicalId":211528,"journal":{"name":"2013 IEEE 21st Symposium on Computer Arithmetic","volume":null,"pages":null},"PeriodicalIF":0.0,"publicationDate":"2013-04-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122165839","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
相关产品
×
本文献相关产品
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术官方微信