Seamless Compiler Integration of Variable Precision Floating-Point Arithmetic

2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO). Published: 2021-02-27. DOI: 10.1109/CGO51591.2021.9370331

T. Jost, Y. Durand, Christian Fabre, Albert Cohen, F. Pétrot


Citations: 3

Abstract

Floating-point (FP) units in processors are generally limited to a subset of the formats defined by the IEEE 754 standard. As a result, high-efficiency languages and optimizing compilers for high-performance computing support only IEEE standard types, and applications that need higher precision must resort to cumbersome memory management and calls to external libraries, which bloats the code and obscures the intent of the program. We present an extension of the C type system that can represent generic FP operations and formats, supporting both static precision and dynamically variable precision. We design and implement a compilation flow that bridges the abstraction gap between this type system and low-level FP instructions or software libraries. We demonstrate the effectiveness of our solution through an LLVM-based implementation that leverages aggressive LLVM optimizations, including the Polly loop nest optimizer, and targets two backend code generators: one for the ISA of a variable-precision FP arithmetic coprocessor, and one for the MPFR multi-precision floating-point library. Our optimizing compilation flow targeting MPFR outperforms the Boost programming interface for the MPFR library by factors of 1.80× and 1.67× in sequential execution of the PolyBench and RAJAPerf suites, respectively, and by a factor of 7.62× on an 8-core (16-thread) machine for RAJAPerf with OpenMP.
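
To make the motivation concrete, the following minimal sketch shows a computation where IEEE double precision is insufficient, and what it means for precision to be chosen at run time. It is an illustration only: it uses Python's standard `decimal` module (decimal rather than binary arithmetic) as a stand-in and is not the paper's C type extension, its compiler, or MPFR. The function names `dot_double` and `dot_variable` and the input vectors are hypothetical.

```python
from decimal import Decimal, localcontext


def dot_double(xs, ys):
    """Dot product accumulated in IEEE 754 double precision."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


def dot_variable(xs, ys, digits):
    """Dot product accumulated at a precision chosen at run time.

    `digits` is the number of significant decimal digits kept by every
    arithmetic operation inside the block (an assumption of this sketch;
    a binary library such as MPFR would count bits instead).
    """
    with localcontext() as ctx:
        ctx.prec = digits
        acc = Decimal(0)
        for x, y in zip(xs, ys):
            # Decimal(float) converts exactly; rounding happens in * and +.
            acc += Decimal(x) * Decimal(y)
        return acc


# The small term is absorbed by the large ones in double precision.
xs = [1e16, 1.0, -1e16]
ys = [1.0, 1.0, 1.0]

print(dot_double(xs, ys))        # the 1.0 is lost to rounding
print(dot_variable(xs, ys, 8))   # too few digits: still lost
print(dot_variable(xs, ys, 30))  # enough digits: the 1 survives
```

The accumulation loop is identical in both functions; only the number type and its precision differ. That separation between the algorithm and the precision it runs at is what a precision-aware type system can express directly, instead of leaving it to explicit library calls.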
