2022 IEEE 29th Symposium on Computer Arithmetic (ARITH): Latest Papers

Generating Very Large RNS Bases
2022 IEEE 29th Symposium on Computer Arithmetic (ARITH) Pub Date : 2022-09-01 DOI: 10.1109/arith54963.2022.00027
J. Bajard, Kazuhide Fukushima, T. Plantard, Arnaud Sipasseuth
Citations: 0
Accelerating Variants of the Conjugate Gradient with the Variable Precision Processor
2022 IEEE 29th Symposium on Computer Arithmetic (ARITH) Pub Date : 2022-09-01 DOI: 10.1109/ARITH54963.2022.00017
Y. Durand, E. Guthmuller, C. F. Tortolero, Jérôme Fereyre, Andrea Bocco, Riccardo Alidori
Abstract: Linear algebra kernels such as linear solvers and eigensolvers are the working engine underneath many scientific applications. The growing scale of these applications has led researchers to rely on high-precision computing to improve their efficiency and stability. In this work, we investigate the impact of arbitrary extended precision on multiple variants of the Conjugate Gradient method (CG). We show how our VRP processor improves the convergence and the efficiency of these kernels. We also illustrate how our set of tools (library, software environment) enables legacy applications to be migrated in a fast and intuitive way while preserving high performance. We observe up to an 8X improvement in kernel iteration count and up to a 40% improvement in latency. Nevertheless, the main benefit is the stability gained with the precision. It makes it possible to resolve larger and ill-conditioned systems without costly compensating techniques.
Citations: 0
Formal Verification of a Chained Multiply-Add Design: Combining Theorem Proving and Equivalence Checking
2022 IEEE 29th Symposium on Computer Arithmetic (ARITH) Pub Date : 2022-09-01 DOI: 10.1109/ARITH54963.2022.00030
David M. Russinoff, J. Bruguera, C. Chau, M. Manjrekar, Nicholas Pfister, Harsha Valsaraju
Abstract: We present a hybrid methodology for the formal verification of arithmetic RTL designs that combines sequential logic equivalence checking with interactive theorem proving in a two-step process. First, an intermediate model of the design is extracted by hand and coded in Restricted Algorithmic C, a simple C subset augmented by the C++ register class templates of Algorithmic C, which provide the bit manipulation features of Verilog. The model is designed to mirror the RTL microarchitecture closely enough to allow efficient equivalence checking, but sufficiently abstract to be amenable to formal analysis. The model is then automatically translated to the logic of the ACL2 theorem prover, which is used to establish correctness with respect to an architectural specification. As an illustration, we describe the modeling and proof of correctness of a chained multiply-add module, designed to test techniques for area and power reduction and intended for implementation in future Arm graphics processors.
Citations: 0
Low-Latency and High-Bandwidth Pipelined Radix-64 Division and Square Root Unit
2022 IEEE 29th Symposium on Computer Arithmetic (ARITH) Pub Date : 2022-09-01 DOI: 10.1109/ARITH54963.2022.00012
J. Bruguera
Abstract: Digit-recurrence algorithms are widely used in actual microprocessors to compute floating-point division and square root. These iterative algorithms present a good trade-off in terms of performance, area and power. Commercial processors have non-pipelined division and square root units where part of the logic is used over several cycles. The main drawbacks of these non-pipelined units are the long latency of the traditional division and square root implementations, the low bandwidth (or throughput) due to the reuse of part of the logic over several cycles, and their hardware complexity, with separate logic for division and square root. We present a radix-64 floating-point division and square root algorithm with a common iteration for division and square root, in which each radix-64 iteration is made of two simpler radix-8 iterations. The radix-64 algorithm enables low-latency operations, and the common division and square root radix-64 iteration results in some area reduction. The algorithm is mapped into a low-latency and high-bandwidth pipelined unit.
Citations: 1
A BF16 FMA is All You Need for DNN Training
2022 IEEE 29th Symposium on Computer Arithmetic (ARITH) Pub Date : 2022-09-01 DOI: 10.1109/arith54963.2022.00011
John Osorio Ríos, Adrià Armejach, E. Petit, G. Henry, Marc Casas
Citations: 0
Enhanced Floating-Point Adder with Full Denormal Support
2022 IEEE 29th Symposium on Computer Arithmetic (ARITH) Pub Date : 2022-09-01 DOI: 10.1109/ARITH54963.2022.00015
Jongwook Sohn, David K. Dean, Eric E. Quintana, Wing Shek Wong
Abstract: This paper presents an enhanced floating-point adder (FADD) design for the Intel E-Core processor. Floating-point addition and subtraction are two of the most widely used operations in many applications. The proposed FADD is executed in 2 cycles, fully pipelined, handles SSE/AVX operations for scalar/packed IEEE single and double precision, and supports all four rounding modes. Also, the proposed FADD fully supports both denormal inputs and underflow outputs without microcode assistance. To achieve the 2-cycle FADD with full denormal support, several optimization techniques are applied: split path algorithm, early alignment and sticky logic, parallel addition, rounding and all-ones detection, and modified leading zero anticipation (LZA) for masking the underflow. As a result, the proposed FADD achieved not only full denormal support but also about 12.5% reduced latency compared to the traditional FADD designs.
Citations: 0
PMNS for efficient arithmetic and small memory cost
2022 IEEE 29th Symposium on Computer Arithmetic (ARITH) Pub Date : 2022-09-01 DOI: 10.1109/ARITH54963.2022.00023
Fangan-Yssouf Dosso, J. Robert, P. Véron
Citations: 0
Point-Targeted Sparseness and Ling Transforms on Parallel Prefix Adder Trees
2022 IEEE 29th Symposium on Computer Arithmetic (ARITH) Pub Date : 2022-09-01 DOI: 10.1109/ARITH54963.2022.00021
Teodor-Dumitru Ene, J. Stine
Abstract: Rephrasing binary addition as a parallel prefix tree problem allows for the generation of high-performance architectures with logarithmic delay. Modern literature and implementation seek to explore this prefix tree design space in order to identify optimal circuits for each target application. This paper broadens the scope of the design space by treating both preprocessing and post-processing nodes as malleable parts of the tree structure. Structures obtained through this novel approach are shown to have superior performance. Implementation results are presented using the SkyWater Open Source 130nm PDK, and the open-source tools developed by this paper are made available.
Citations: 2
Bounding the Round-Off Error of the Upwind Scheme for Advection
2022 IEEE 29th Symposium on Computer Arithmetic (ARITH) Pub Date : 2022-09-01 DOI: 10.1109/arith54963.2022.00031
Louise Ben Salem-Knapp, S. Boldo, William Weens
Citations: 0
Efficient Reduction Algorithms for Special Gaussian Integer Moduli
2022 IEEE 29th Symposium on Computer Arithmetic (ARITH) Pub Date : 2022-09-01 DOI: 10.1109/ARITH54963.2022.00029
Malek Safieh, F. D. Santis
Abstract: Gaussian integers are a subset of the complex numbers with integers as real and imaginary parts. When Gaussian integers are equipped with modulo operations, they form Gaussian integer rings or fields, depending on the specific choice of the modulus. Arithmetic on Gaussian integers can offer advantages in terms of operand size and improved parallelism, due to independent calculation of the real and imaginary parts. However, although Gaussian integer modulo reduction is the fundamental operation to enable computations in finite Gaussian integer rings and fields, efficient algorithms for Gaussian integer modulo reduction have not been widely investigated so far. In this work, we fill this gap and present efficient reduction algorithms for Gaussian integer moduli of special forms. Indeed, we demonstrate that there exist different classes of Gaussian integer moduli allowing for very fast reductions. Finally, we show that the computational complexity of the proposed algorithm is significantly reduced compared with generic Gaussian integer reduction methods known to date, e.g., Montgomery-based reduction for Gaussian integers.
Citations: 0