{"title":"Augmented Arithmetic Operations Proposed for IEEE-754 2018","authors":"Jason Riedy, J. Demmel","doi":"10.1109/ARITH.2018.8464813","DOIUrl":"https://doi.org/10.1109/ARITH.2018.8464813","url":null,"abstract":"Algorithms for extending arithmetic precision through compensated summation or arithmetics like double-double rely on operations commonly called twoSum and twoProd-uct. The current draft of the IEEE 754 standard specifies these operations under the names augmentedAddition and augment-edMultiplication. These operations were included after three decades of experience because of a motivating new use: bitwise reproducible arithmetic. Standardizing the operations provides a hardware acceleration target that can provide at least a 33 % speed improvements in reproducible dot product, placing reproducible dot product almost within a factor of two of common dot product. This paper provides history and motivation for standardizing these operations. We also define the operations, explain the rationale for all the specific choices, and provide parameterized test cases for new boundary behaviors.","PeriodicalId":6576,"journal":{"name":"2018 IEEE 25th Symposium on Computer Arithmetic (ARITH)","volume":"286 1","pages":"45-52"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73257723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"On Various Ways to Split a Floating-Point Number","authors":"C. Jeannerod, J. Muller, P. Zimmermann","doi":"10.1109/ARITH.2018.8464793","DOIUrl":"https://doi.org/10.1109/ARITH.2018.8464793","url":null,"abstract":"We review several ways to split a floating-point number, that is, to decompose it into the exact sum of two floating-point numbers of smaller precision. All the methods considered here involve only a few IEEE floating-point operations, with rounding to nearest and including possibly the fused multiply -add (FMA). Applications range from the implementation of integer functions such as round and floor to the computation of suitable scaling factors aimed, for example, at avoiding spurious underflows and overflows when implementing functions such as the hypotenuse.","PeriodicalId":6576,"journal":{"name":"2018 IEEE 25th Symposium on Computer Arithmetic (ARITH)","volume":"42 1","pages":"53-60"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72779520","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A New Variant of the Barrett Algorithm Applied to Quotient Selection","authors":"Niall Emmart, Fangyu Zheng, C. Weems","doi":"10.1109/ARITH.2018.8464771","DOIUrl":"https://doi.org/10.1109/ARITH.2018.8464771","url":null,"abstract":"Quotient Selection (QS) is a key step in the classic $O(n^{2}$) multiple precision division algorithm. On processors with fast hardware division, it is a trivial problem, but on GPUs, division is quite slow. In this paper we investigate the effectiveness of Brent and Zimmermann's variant as well as our own novel variant of Barrett's algorithm. Our new approach is shown to be suitable for low radix (single precision) QS. Three highly optimized implementations, two of the Brent and Zimmerman variant and one based on our new approach, have been developed and we show that each is many times faster than using the division operation built in to the compiler. In addition, our variant is on average 22 % faster than the other two implementations. We also sketch proofs of correctness for all of the implementations and our new algorithm.","PeriodicalId":6576,"journal":{"name":"2018 IEEE 25th Symposium on Computer Arithmetic (ARITH)","volume":"18 1","pages":"138-144"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77231484","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"A High Throughput Polynomial and Rational Function Approximations Evaluator","authors":"N. Brisebarre, G. Constantinides, Milos Ercezovac, Silviu-Ioan Filip, Matei Iştoan, J. Muller","doi":"10.1109/ARITH.2018.8464778","DOIUrl":"https://doi.org/10.1109/ARITH.2018.8464778","url":null,"abstract":"We present an automatic method for the evaluation of functions via polynomial or rational approximations and its hardware implementation, on FPGAs. These approximations are evaluated using Ercegovac's iterative E-method adapted for FPGA implementation. The polynomial and rational function coefficients are optimized such that they satisfy the constraints of the E-method. We present several examples of practical interest; in each case a resource-efficient approximation is proposed and comparisons are made with alternative approaches.","PeriodicalId":6576,"journal":{"name":"2018 IEEE 25th Symposium on Computer Arithmetic (ARITH)","volume":"63 1","pages":"99-106"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84311677","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Enhanced Vector Math Support on the Intel®AVX-512 Architecture","authors":"Cristina S. Anderson, Jingwei Zhang, Marius Cornea","doi":"10.1109/ARITH.2018.8464794","DOIUrl":"https://doi.org/10.1109/ARITH.2018.8464794","url":null,"abstract":"The Intel®AVX-512 architecture adds new capabilities such as masked execution, floating-point exception suppression and static rounding modes, as well as a small set of new instructions for mathematical library support. These new features allow for better compliance with floating-point or language standards (e.g. no spurious floating-point exceptions, and faster or more accurate code for directed rounding modes), as well as simpler, smaller footprint implementations that eliminate branches and special case paths. Performance is also improved, in particular for vector mathematical functions (which benefit from easier processing in the main path, and fast access to small lookup tables). In this paper, we describe the relevant new features and their possible applications to floating-point computation. The code examples include a few compact implementation sequences for some common vector mathematical functions.","PeriodicalId":6576,"journal":{"name":"2018 IEEE 25th Symposium on Computer Arithmetic (ARITH)","volume":"1 1","pages":"120-124"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77286435","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Fast multiplication of binary polynomials with the forthcoming vectorized VPCLMULQDQ instruction","authors":"Nir Drucker, S. Gueron, V. Krasnov","doi":"10.1109/ARITH.2018.8464777","DOIUrl":"https://doi.org/10.1109/ARITH.2018.8464777","url":null,"abstract":"Polynomial multiplication over binary fields $mathbb{F}_{2^{n}}$ is a common primitive, used for example by current cryptosystems such as AES-GCM (with $n=128)$. It also turns out to be a primitive for other cryptosystems, that are being designed for the Post Quantum era, with values $ngg 128$. Examples from the recent submissions to the NIST Post-Quantum Cryptography project, are BIKE, LEDAKem, and GeMSS, where the performance of the polynomial multiplications, is significant. Therefore, efficient polynomial multiplication over $mathbb{F}_{2^{n}}$, with large $n$, is a significant emerging optimization target. Anticipating future applications, Intel has recently announced that its future architecture (codename “Ice Lake”) will introduce a new vectorized way to use the current VPCLMULQDQ instruction. In this paper, we demonstrate how to use this instruction for accelerating polynomial multiplication. Our analysis shows a prediction for at least 2x speedup for multiplications with polynomials of degree 512 or more.","PeriodicalId":6576,"journal":{"name":"2018 IEEE 25th Symposium on Computer Arithmetic (ARITH)","volume":"71 1","pages":"115-119"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90618571","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Tunable Floating-Point for Energy Efficient Accelerators","authors":"A. Nannarelli","doi":"10.1109/ARITH.2018.8464797","DOIUrl":"https://doi.org/10.1109/ARITH.2018.8464797","url":null,"abstract":"In this work, we address the design of an on-chip accelerator for Machine Learning and other computation-demanding applications with a Tunable Floating-Point (TFP) precision. The precision can be chosen for a single operation by selecting a specific number of bits for significand and exponent in the floating-point representation. By tuning the precision of a given algorithm to the minimum precision achieving an acceptable target error, we can make the computation more power efficient. We focus on floating-point multiplication, which is the most power demanding arithmetic operation.","PeriodicalId":6576,"journal":{"name":"2018 IEEE 25th Symposium on Computer Arithmetic (ARITH)","volume":"6 1","pages":"29-36"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85581806","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Radix-64 Floating-Point Divider","authors":"J. Bruguera","doi":"10.1109/ARITH.2018.8464815","DOIUrl":"https://doi.org/10.1109/ARITH.2018.8464815","url":null,"abstract":"Digit-recurrence division is widely used in actual high-performance microprocessors because it presents a good trade-off in terms of performance, area and power. consumption. In this paper we present a radix-64 divider, providing 6 bits per cycle. To have an affordable implementation, each iteration is composed of three radix-4 iterations; speculation is used between consecutive radix-4 iterations to get a reduced timing. The result is a fast, low-latency floating-point divider, requiring 11, 6, and 4 cycles for double-precision, single-precision and half-precision floating-point division with normalized operands and result. One or two additional cycles are needed in case of subnormal operand(s) or result.","PeriodicalId":6576,"journal":{"name":"2018 IEEE 25th Symposium on Computer Arithmetic (ARITH)","volume":"46 1","pages":"84-91"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88153788","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"The Comeback of Reed Solomon Codes","authors":"Nir Drucker, S. Gueron, V. Krasnov","doi":"10.1109/ARITH.2018.8464690","DOIUrl":"https://doi.org/10.1109/ARITH.2018.8464690","url":null,"abstract":"Distributed storage systems utilize erasure codes to reduce their storage costs while efficiently handling failures. Many of these codes (e. g., Reed-Solomon (RS) codes) rely on Galois Field (GF) arithmetic, which is considered to be fast when the field characteristic is 2. Nevertheless, some developments in the field of erasure codes offer new efficient techniques that require mostly XOR operations, and are thus faster than GF operations. Recently, Intel announced [1] that its future architecture (codename “Ice Lake”) will introduce new set of instructions called Galois Field New Instruction (GF-NI). These instructions allow software flows to perform vector and matrix multiplications over GF (28) on the wide registers that are available on the AVX512 architectures. In this paper, we explain the functionality of these instructions, and demonstrate their usage for some fast computations in GF(28). We also use the Intel® Intelligent Storage Acceleration Library (ISA-L) in order to estimate potential future improvement for erasure codes that are based on RS codes. Our results predict $approx 1.4mathrm{x}$ speedup for vectorized multiplication, and 1.83x speedup for the actual encoding.","PeriodicalId":6576,"journal":{"name":"2018 IEEE 25th Symposium on Computer Arithmetic (ARITH)","volume":"56 1","pages":"125-129"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82556632","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Faster Modular Exponentiation Using Double Precision Floating Point Arithmetic on the GPU","authors":"Niall Emmart, Fangyu Zheng, C. Weems","doi":"10.1109/ARITH.2018.8464792","DOIUrl":"https://doi.org/10.1109/ARITH.2018.8464792","url":null,"abstract":"This paper presents a new approach to integer multiple precision (MP) modular exponentiation, using double-precision floating point (DPF) operations, that is suitable for GPU implementation. We show speedups ranging from 20 % to 34 % over the best prior G PU times for sizes corresponding to common RSA cryptographic operations (2048 to 4096 bits). Three techniques are described. First, by adding 2104to the high half of the product, and 252 to the low half, we set the implicit leading 1 in the DPF mantissa so that the full 52 explicit bits are available for each half of the 104-bit products of samples. Second, the DPF values are cast bitwise to 64-bit integers for adding the column sums to get the MP result. Normally the cast would require masking off the exponents, but because they are constant, we can include them in the column sums and correct just once for their total. Third, by initializing the column sums with the appropriate negative value to compensate for the exponent sums, no corrective subtraction is needed. Our implementation on an NVIDIA GTX Titan Black GPU achieves between 132.5K and 161.9K modular exponentiations per second of size 1024 bits, with latencies ranging from 21.7 ms to 17.8 ms, making it practical for online RSA applications. Proportional results are shown for 1536 and 2048 bits. The implementation is so efficient that its maximum sustained performance is actually bounded by the thermal limit of the GPU.","PeriodicalId":6576,"journal":{"name":"2018 IEEE 25th Symposium on Computer Arithmetic (ARITH)","volume":"2013 1","pages":"130-137"},"PeriodicalIF":0.0,"publicationDate":"2018-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82608726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}