Bringing Compiling Databases to RISC Architectures

Proc. VLDB Endow. Pub Date : 2023-02-01 DOI:10.14778/3583140.3583142

F. Gruber, Maximilian Bandle, A. Engelke, Thomas Neumann, Jana Giceva

Pages: 1222-1234

Citations: 3

Abstract

Current hardware developments greatly influence the design decisions of modern database systems. In many modern performance-focused database systems, query compilation has emerged as an integral component, and different approaches to code generation have evolved, making use of standard compilers, general-purpose compiler libraries, or domain-specific code generators. However, development has primarily focused on the dominant x86-64 server architecture and has neglected current hardware trends towards other CPU architectures, such as ARM and other RISC architectures.

Therefore, we explore the design space of code generation in database systems, considering a variety of state-of-the-art compilation approaches and assessing them with a set of qualitative and quantitative metrics. Based on our findings, we have developed a new code generator called FireARM for AArch64-based systems in our database system, Umbra. We identify general as well as architecture-specific challenges for custom code generation in databases and provide potential solutions to abstract or handle them.

Furthermore, we present an extensive evaluation of different compilation approaches in Umbra on a wide variety of x86-64 and ARM machines. In particular, we compare quantitative performance characteristics such as compilation latency and query throughput.

Our results show that using standard languages and compiler infrastructures lowers the barrier to employing query compilation and allows for high performance on big data sets, while domain-specific code generators can achieve a significantly lower compilation overhead and allow for better targeting of new architectures.



