CPFloat: A C Library for Simulating Low-precision Arithmetic

IF 3.2, Region 1 (Mathematics), JCR Q2 (COMPUTER SCIENCE, SOFTWARE ENGINEERING)

ACM Transactions on Mathematical Software, Pub Date: 2023-06-17, DOI: https://dl.acm.org/doi/10.1145/3585515

Massimiliano Fasi, Mantas Mikaitis


Citations: 0

Abstract

One can simulate low-precision floating-point arithmetic via software by executing each arithmetic operation in hardware and then rounding the result to the desired number of significant bits. For IEEE-compliant formats, rounding requires only standard mathematical library functions, but handling subnormals, underflow, and overflow demands special attention, and numerical errors can cause mathematically correct formulae to behave incorrectly in finite arithmetic. Moreover, the ensuing implementations are not necessarily efficient, as the library functions these techniques build upon are typically designed to handle a broad range of cases and may not be optimized for the specific needs of rounding algorithms. CPFloat is a C library for simulating low-precision arithmetics. It offers efficient routines for rounding, performing mathematical computations, and querying properties of the simulated low-precision format. The software exploits the bit-level floating-point representation of the format in which the numbers are stored and replaces costly library calls with low-level bit manipulations and integer arithmetic. In numerical experiments, the new techniques bring a considerable speedup (typically one order of magnitude or more) over existing alternatives in C, C++, and MATLAB. To our knowledge, CPFloat is currently the most efficient and complete library for experimenting with custom low-precision floating-point arithmetic.
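The bit-level idea mentioned in the abstract can be illustrated with a minimal sketch. It is not CPFloat's API or implementation: the function name `round_to_precision` is hypothetical, the storage format is assumed to be IEEE binary64, and only round-to-nearest with ties-to-even is shown. The exponent range of the target format is deliberately ignored, so the subnormal, underflow, and overflow handling that the abstract singles out as delicate is not covered here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not part of CPFloat: round a binary64 value to p
   significant bits (2 <= p <= 53) with round-to-nearest, ties-to-even.
   Assumption: the exponent range of the target format is ignored, so
   subnormals, underflow, and overflow of that format are not handled. */
static double round_to_precision(double x, int p)
{
    assert(p >= 2 && p <= 53);

    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);            /* well-defined type punning */

    /* Leave infinities and NaNs (exponent field all ones) untouched. */
    if (((bits >> 52) & 0x7FF) == 0x7FF)
        return x;

    int drop = 53 - p;                         /* trailing bits to discard */
    if (drop == 0)
        return x;

    uint64_t last_kept = (bits >> drop) & 1;   /* parity of the kept part */
    uint64_t half = UINT64_C(1) << (drop - 1); /* half a unit in the last kept place */

    /* Adding half - 1 + parity rounds to nearest and breaks ties toward an
       even kept significand; a carry propagates into the exponent field,
       which is the desired result when the significand overflows. */
    bits += half - 1 + last_kept;
    bits &= ~((UINT64_C(1) << drop) - 1);      /* clear the discarded bits */

    memcpy(&x, &bits, sizeof x);
    return x;
}
```

For instance, `round_to_precision(0.1, 11)` returns 0.0999755859375, the value nearest to 0.1 that has 11 significant bits (the precision of binary16). The whole rounding step uses one integer addition and one mask, with no calls to the mathematical library.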




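Source journal: ACM Transactions on Mathematical Software (Engineering & Technology, Computer Science: Software Engineering)

CiteScore: 5.00

Self-citation rate: 3.70%

Articles published:

Review time: >12 weeks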

Journal description: As a scientific journal, ACM Transactions on Mathematical Software (TOMS) documents the theoretical underpinnings of numeric, symbolic, algebraic, and geometric computing applications. It focuses on analysis and construction of algorithms and programs, and the interaction of programs and architecture. Algorithms documented in TOMS are available as the Collected Algorithms of the ACM at calgo.acm.org.