2018 IEEE 25th Symposium on Computer Arithmetic (ARITH): Latest Papers

Proceedings of the 25th International Symposium on Computer Arithmetic
2018 IEEE 25th Symposium on Computer Arithmetic (ARITH) Pub Date : 2018-06-01 DOI: 10.1109/arith.2018.8464697
Citations: 0
A Formally-Proved Algorithm to Compute the Correct Average of Decimal Floating-Point Numbers
2018 IEEE 25th Symposium on Computer Arithmetic (ARITH) Pub Date : 2018-06-01 DOI: 10.1109/ARITH.2018.8464761
S. Boldo, Florian Faissole, Vincent Tourneur
Abstract: Some modern processors include decimal floating-point units, with a conforming implementation of the IEEE 754-2008 standard. Unfortunately, many algorithms from the computer arithmetic literature are no longer correct when computations are done in radix 10. This is in particular the case for the computation of the average of two floating-point numbers. Several radix-2 algorithms are available, including one that provides the correct rounding, but none hold in radix 10. This paper presents a new radix-10 algorithm that computes the correctly-rounded average. To guarantee a higher level of confidence, we also provide a Coq formal proof of our theorems that takes gradual underflow into account. Note that our formal proof was generalized to ensure this algorithm is correct when computations are done with any even radix.
Pages: 69-75
Citations: 2
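To make the problem described in the abstract above concrete, the following minimal sketch shows the naive formula (a + b) / 2 failing in radix 10. It is not the paper's algorithm. It uses Python's standard `decimal` module with a toy precision of 3 significant digits; the function name `naive_average` and the chosen inputs are assumptions made only for illustration.

```python
from decimal import Decimal, ROUND_HALF_EVEN, localcontext


def naive_average(a: Decimal, b: Decimal, digits: int) -> Decimal:
    """Compute (a + b) / 2, rounding each operation to `digits` significant digits."""
    with localcontext() as ctx:
        ctx.prec = digits                 # toy working precision (assumption)
        ctx.rounding = ROUND_HALF_EVEN    # round to nearest, ties to even
        return (a + b) / 2


a, b = Decimal("5.01"), Decimal("5.03")

# With 3 digits, a + b = 10.04 is rounded to 10.0 before the halving step.
naive = naive_average(a, b, digits=3)

# The default context carries 28 digits, so this average is exact here.
exact = (a + b) / 2

print("naive:", naive, "| exact:", exact)
print("naive result below both inputs:", naive < a)
```

In this instance the naive result is smaller than both inputs, so it is not even inside the interval [a, b], whereas the exact average 5.02 is representable with 3 digits.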
Karatsuba with Rectangular Multipliers for FPGAs
2018 IEEE 25th Symposium on Computer Arithmetic (ARITH) Pub Date : 2018-06-01 DOI: 10.1109/ARITH.2018.8464809
M. Kumm, O. Gustafsson, F. D. Dinechin, Johannes Kappauf, P. Zipf
Abstract: This work presents an extension of Karatsuba's method to efficiently use rectangular multipliers as a base for larger multipliers. The rectangular multipliers that motivate this work are the embedded 18 × 25-bit signed multipliers found in the DSP blocks of recent Xilinx FPGAs: the traditional Karatsuba approach must under-use them as square 18 × 18 ones. This work shows that rectangular multipliers can be efficiently exploited in a modified Karatsuba method if their input word sizes have a large greatest common divisor. In the Xilinx FPGA case, this can be obtained by using the embedded multipliers as 16 × 24 unsigned and as 17 × 25 signed ones. The obtained architectures are implemented with due detail to architectural features such as the pre-adders and post-adders available in Xilinx DSP blocks. They are synthesized and compared with traditional Karatsuba, but also with (non-Karatsuba) state-of-the-art tiling techniques that make use of the full rectangular multipliers. The proposed technique improves resource consumption and performance for multipliers of numbers larger than 64 bits.
Pages: 13-20
Citations: 10
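For orientation, the classic square Karatsuba recursion that the abstract above takes as its starting point can be sketched as follows. This is the textbook method on non-negative Python integers, not the rectangular variant proposed in the paper. The function name and the `base_bits` threshold (set to 18 only to echo the multiplier width mentioned above) are assumptions.

```python
def karatsuba(x: int, y: int, base_bits: int = 18) -> int:
    """Multiply two non-negative integers with three half-size products per level."""
    # Base case: operands small enough for one "hardware" multiplication.
    if x < (1 << base_bits) or y < (1 << base_bits):
        return x * y

    # Split both operands at the same bit position.
    half = max(x.bit_length(), y.bit_length()) // 2
    mask = (1 << half) - 1
    x_hi, x_lo = x >> half, x & mask
    y_hi, y_lo = y >> half, y & mask

    # Three recursive products instead of the four used by schoolbook splitting.
    z_hi = karatsuba(x_hi, y_hi, base_bits)
    z_lo = karatsuba(x_lo, y_lo, base_bits)
    z_mid = karatsuba(x_hi + x_lo, y_hi + y_lo, base_bits) - z_hi - z_lo

    # Recombine: x*y = z_hi * 2^(2*half) + z_mid * 2^half + z_lo.
    return (z_hi << (2 * half)) + (z_mid << half) + z_lo


a = 0x1234_5678_9ABC_DEF0_1234_5678
b = 0x0FED_CBA9_8765_4321_0FED_CBA9
print(karatsuba(a, b) == a * b)
```

The middle term is where the saving comes from: one product of sums replaces two cross products, at the cost of extra additions and subtractions.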
Digit Elision for Arbitrary-accuracy Iterative Computation
2018 IEEE 25th Symposium on Computer Arithmetic (ARITH) Pub Date : 2018-06-01 DOI: 10.1109/ARITH.2018.8464691
He Li, James J. Davis, John Wickerson, G. Constantinides
Abstract: We recently proposed the first hardware architecture enabling the iterative solution of systems of linear equations to accuracies limited only by the amount of available memory. This technique, named ARCHITECT, achieves exact numeric computation by using online arithmetic to allow the refinement of results from earlier iterations over time, eschewing rounding error. ARCHITECT has a key drawback, however: often, many more digits than strictly necessary are generated, a problem that worsens as more accurate solutions are sought. In this paper, we infer the locations of these superfluous digits within stationary iterative calculations by exploiting online arithmetic's digit dependencies and using forward error analysis. We demonstrate that their lack of computation is guaranteed not to affect the ability to reach a solution of any accuracy. Versus ARCHITECT, our illustrative hardware implementation achieves a geometric mean 20.1× speedup in the solution of a set of representative linear systems through the avoidance of redundant digit calculation. For the computation of high-precision results, we also obtain an up to 22.4× reduction in memory requirements over the same baseline. Finally, we demonstrate that solvers implemented following our proposals can show superiority over conventional arithmetic implementations by virtue of their runtime-tunable precisions.
Pages: 107-114
Citations: 3
VeriTracer: Context-enriched tracer for floating-point arithmetic analysis
2018 IEEE 25th Symposium on Computer Arithmetic (ARITH) Pub Date : 2018-06-01 DOI: 10.1109/ARITH.2018.8464687
Yohan Chatelain, P. D. O. Castro, E. Petit, D. Defour, J. Bieder, M. Torrent
Abstract: VeriTracer automatically instruments a code and traces the accuracy of floating-point variables over time. VeriTracer enriches the visual traces with contextual information such as the call site path in which a value was modified. Contextual information is important to understand how floating-point errors propagate in complex codes. VeriTracer is implemented as an LLVM compiler tool on top of Verificarlo. We demonstrate how VeriTracer can detect accuracy loss and quantify the impact of using a compensated algorithm on ABINIT, an industrial HPC application for ab initio quantum computation.
Pages: 61-68
Citations: 8
High Density and Performance Multiplication for FPGA
2018 IEEE 25th Symposium on Computer Arithmetic (ARITH) Pub Date : 2018-06-01 DOI: 10.1109/ARITH.2018.8464695
M. Langhammer, Gregg Baeckler
Abstract: Arithmetic-based applications are one of the most common use cases for modern FPGAs. Currently, machine learning is emerging as the fastest growth area for FPGAs, renewing an interest in low-precision multiplication. There is now a new focus on multiplication in the soft fabric: very high-density systems, consisting of many thousands of operations, are the current norm. In this paper we introduce multiplier regularization, which restructures common multiplier algorithms into smaller and more efficient architectures. The multiplier structure is parameterizable, and results are given for a continuous range of input sizes, although the algorithm is most efficient for small input precisions. The multiplier is particularly effective for typical machine learning inferencing uses, and the presented cores can be used for the dot products required for these applications. Although the examples presented here are optimized for Intel Stratix 10 devices, the concept of regularized arithmetic structures is applicable to generic FPGA LUT architectures. Results are compared to Intel Megafunction IP as well as contrasted with normalized representations of recently published results for Xilinx devices. We report a 10% to 35% smaller area, and a more significant latency reduction, in the range of 25% to 50%, for typical inferencing use cases.
Pages: 5-12
Citations: 17
Flexpoint: Predictive Numerics for Deep Learning
2018 IEEE 25th Symposium on Computer Arithmetic (ARITH) Pub Date : 2018-06-01 DOI: 10.1109/ARITH.2018.8464801
Valentina Popescu, M. Nassar, Xin Wang, E. Tumer, T. Webb
Abstract: Deep learning has been undergoing rapid growth in recent years thanks to its state-of-the-art performance across a wide range of real-world applications. Traditionally, neural networks were trained in IEEE-754 binary64 or binary32 format, a common practice in general scientific computing. However, the unique computational requirements of deep neural network training workloads allow for much more efficient and inexpensive alternatives, unleashing a new wave of numerical innovations powering specialized computing hardware. We previously presented Flexpoint, a blocked fixed-point data type combined with a novel predictive exponent management algorithm designed to support training of deep networks without modifications, aiming at a seamless replacement of the binary32 format widely used in practice today. We showed that Flexpoint with 16-bit mantissa and 5-bit shared exponent (flex16+5) achieved numerical parity to binary32 in training a number of convolutional neural networks. In the current paper we review the continuing trend of predictive numerics enhancing deep neural network training in specialized computing devices such as the Intel® Nervana™ Neural Network Processor.
Pages: 1-4
Citations: 20
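The idea of a blocked fixed-point type with one shared exponent, as described in the abstract above, can be illustrated with a small sketch. It is only an assumption-laden toy: the exponent is chosen from the block's current maximum rather than predicted, it is not limited to 5 bits, and the names `encode_block` and `decode_block` are hypothetical rather than part of any Flexpoint API.

```python
import math


def encode_block(values, mantissa_bits=16):
    """Quantize a block of floats to signed integer mantissas sharing one exponent."""
    limit = 2 ** (mantissa_bits - 1) - 1          # largest mantissa magnitude
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return [0] * len(values), 0
    # Smallest power-of-two scale at which the largest value still fits.
    exponent = math.ceil(math.log2(max_abs / limit))
    scale = 2.0 ** exponent
    # Clamp defensively in case of floating-point error in the log2 step.
    mantissas = [max(-limit, min(limit, round(v / scale))) for v in values]
    return mantissas, exponent


def decode_block(mantissas, exponent):
    """Recover approximate floats from the mantissas and the shared exponent."""
    scale = 2.0 ** exponent
    return [m * scale for m in mantissas]


block = [0.5, -0.25, 0.125]
mantissas, exponent = encode_block(block)
print(mantissas, exponent, decode_block(mantissas, exponent))
```

Because every element in the block shares the exponent, arithmetic on the mantissas is plain integer arithmetic; the price is that small values in a block with one large value lose relative precision.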
Combining Restoring Array and Logarithmic Dividers into an Approximate Hybrid Design
2018 IEEE 25th Symposium on Computer Arithmetic (ARITH) Pub Date : 2018-06-01 DOI: 10.1109/ARITH.2018.8464807
Weiqiang Liu, Jing Li, Tao Xu, Chenghua Wang, P. Montuschi, F. Lombardi
Abstract: This paper proposes a new design of an approximate hybrid divider (AXHD), which combines the restoring array and the logarithmic dividers to achieve an excellent tradeoff between accuracy and hardware performance. Exact restoring divider cells (EXDCrs) are used to generate the MSBs of the quotient to attain high accuracy; the other quotient digits are processed by a logarithmic divider as an inexact scheme to improve figures of merit such as power consumption, area and delay. The proposed AXHD is evaluated and analyzed using error and hardware metrics. The proposed design is also compared with the exact restoring divider (EXDr) and previous approximate restoring dividers (AXDrs). The results show that the proposed design achieves very good performance in terms of accuracy and hardware; case studies for image processing also show the validity of the proposed designs.
Pages: 92-98
Citations: 18
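As background for the exact part of the hybrid design described in the abstract above, a bit-serial restoring division can be sketched in software as follows. This models only the standard exact restoring scheme on unsigned integers; it does not model the logarithmic (approximate) stage or the paper's cell-level design, and the function name and the `width` parameter are assumptions.

```python
def restoring_divide(dividend: int, divisor: int, width: int = 8):
    """Unsigned restoring division producing one quotient bit per step, MSB first."""
    if divisor == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    if not 0 <= dividend < (1 << width):
        raise ValueError("dividend does not fit in the given width")

    remainder = 0
    quotient = 0
    for i in range(width - 1, -1, -1):
        # Shift the next dividend bit into the partial remainder.
        remainder = (remainder << 1) | ((dividend >> i) & 1)
        # Trial subtraction: keep it if non-negative, otherwise "restore".
        trial = remainder - divisor
        if trial >= 0:
            remainder = trial
            quotient |= 1 << i
    return quotient, remainder


print(restoring_divide(200, 7))
```

Since the quotient bits come out most significant first, a hybrid scheme can in principle stop the exact steps early and hand the remaining low-order bits to a cheaper approximate method, which is the tradeoff the abstract describes.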
Approximate Fixed-Point Elementary Function Accelerator for the SpiNNaker-2 Neuromorphic Chip
2018 IEEE 25th Symposium on Computer Arithmetic (ARITH) Pub Date : 2018-06-01 DOI: 10.1109/ARITH.2018.8464785
M. Mikaitis, D. Lester, D. Shang, S. Furber, Gengting Liu, J. Garside, Stefan Scholze, S. Höppner, Andreas Dixius
Abstract: Neuromorphic chips are used to model biologically inspired spiking neural networks (SNNs), where most models are based on differential equations. Equations for most SNN algorithms usually contain variables with one or more $e^{x}$ components. SpiNNaker is a digital neuromorphic chip that has so far been using pre-calculated look-up tables for the exponential function. However, this approach is limited because the memory requirements grow as more complex neural models are developed. To save already limited memory resources in the next-generation SpiNNaker chip, we are including a fast exponential function in the silicon. In this paper we analyse iterative algorithms for elementary functions and show how to build a single hardware accelerator for exp and natural log, for a neuromorphic chip prototype to be manufactured in a 22 nm FDSOI process. We present an accelerator that has algorithmic-level approximation control, allowing it to trade precision for latency and energy efficiency. In addition to the neuromorphic chip application, we provide an analysis of a parameterized elementary function unit that can be tailored for other systems with different power, area, accuracy and latency constraints.
Pages: 37-44
Citations: 13
New Area Record for the AES Combined S-Box/Inverse S-Box
2018 IEEE 25th Symposium on Computer Arithmetic (ARITH) Pub Date : 2018-06-01 DOI: 10.1109/ARITH.2018.8464780
A. Reyhani-Masoleh, Mostafa M. I. Taha, Doaa Ashmawy
Abstract: The AES combined S-box/inverse S-box is a single construction that is shared between the encryption and decryption data paths of the AES. The currently most compact implementation of the AES combined S-box/inverse S-box is Canright's design, introduced back in 2005. Since then, the research community has introduced several optimizations for the S-box only; however, the combined S-box/inverse S-box has received little attention. In this paper, we propose a new AES combined S-box/inverse S-box design that is both smaller and faster than Canright's design. We achieve this goal by proposing a new tower field and optimizing each and every block inside the combined architecture for this field. Our complexity analysis and ASIC implementation results in the CMOS STM 65nm and NanGate 15nm technologies show that our design outperforms its counterparts in terms of area and speed.
Pages: 145-152
Citations: 12