Seamless Compiler Integration of Variable Precision Floating-Point Arithmetic

T. Jost, Y. Durand, Christian Fabre, Albert Cohen, F. Pétrot
Published in: 2021 IEEE/ACM International Symposium on Code Generation and Optimization (CGO), 2021-02-27
DOI: 10.1109/CGO51591.2021.9370331 (https://doi.org/10.1109/CGO51591.2021.9370331)
Citations: 3

Abstract

Floating-point (FP) units in processors are generally limited to supporting a subset of the formats defined by the IEEE 754 standard. As a result, high-efficiency languages and optimizing compilers for high-performance computing support only IEEE standard types, and applications needing higher precision must resort to cumbersome memory management and calls to external libraries, which bloats the code and obscures the intent of the program. We present an extension of the C type system that can represent generic FP operations and formats, supporting both static precision and dynamically variable precision. We design and implement a compilation flow bridging the abstraction gap between this type system and low-level FP instructions or software libraries. We demonstrate the effectiveness of our solution through an LLVM-based implementation that leverages aggressive optimizations in LLVM, including the Polly loop nest optimizer, and targets two backend code generators: one for the ISA of a variable-precision FP arithmetic coprocessor, and one for the MPFR multi-precision floating-point library. Our optimizing compilation flow targeting MPFR outperforms the Boost programming interface for the MPFR library by factors of 1.80× and 1.67× in sequential execution of the PolyBench and RAJAPerf suites, respectively, and by a factor of 7.62× on an 8-core (16-thread) machine for RAJAPerf in OpenMP.