Compiled Query Execution Engine using JVM

22nd International Conference on Data Engineering (ICDE'06) Pub Date : 2006-04-03 DOI:10.1109/ICDE.2006.40

Jun Rao, H. Pirahesh, C. Mohan, G. Lohman

{"title":"Compiled Query Execution Engine using JVM","authors":"Jun Rao, H. Pirahesh, C. Mohan, G. Lohman","doi":"10.1109/ICDE.2006.40","DOIUrl":null,"url":null,"abstract":"A conventional query execution engine in a database system essentially uses a SQL virtual machine (SVM) to interpret a dataflow tree in which each node is associated with a relational operator. During query evaluation, a single tuple at a time is processed and passed among the operators. Such a model is popular because of its efficiency for pipelined processing. However, since each operator is implemented statically, it has to be very generic in order to deal with all possible queries. Such generality tends to introduce significant runtime inefficiency, especially in the context of memory-resident systems, because the granularity of data commercial system, using SVM. processing (a tuple) is too small compared with the associated overhead. Another disadvantage in such an engine is that each operator code is compiled statically, so query-specific optimization cannot be applied. To improve runtime efficiency, we propose a compiled execution engine, which, for a given query, generates new query-specific code on the fly, and then dynamically compiles and executes the code. The Java platform makes our approach particularly interesting for several reasons: (1) modern Java Virtual Machines (JVM) have Just- In-Time (JIT) compilers that optimize code at runtime based on the execution pattern, a key feature that SVMs lack; (2) because of Java’s continued popularity, JVMs keep improving at a faster pace than SVMs, allowing us to exploit new advances in the Java runtime in the future; (3) Java is a dynamic language, which makes it convenient to load a piece of new code on the fly. In this paper, we develop both an interpreted and a compiled query execution engine in a relational, Java-based, in-memory database prototype, and perform an experimental study. Our experimental results on the TPC-H data set show that, despite both engines benefiting from JIT, the compiled engine runs on average about twice as fast as the interpreted one, and significantly faster than an in-memory","PeriodicalId":6819,"journal":{"name":"22nd International Conference on Data Engineering (ICDE'06)","volume":"71 1","pages":"23-23"},"PeriodicalIF":0.0000,"publicationDate":"2006-04-03","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"72","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"22nd International Conference on Data Engineering (ICDE'06)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICDE.2006.40","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 72

Abstract

A conventional query execution engine in a database system essentially uses a SQL virtual machine (SVM) to interpret a dataflow tree in which each node is associated with a relational operator. During query evaluation, a single tuple at a time is processed and passed among the operators. Such a model is popular because of its efficiency for pipelined processing. However, since each operator is implemented statically, it has to be very generic in order to deal with all possible queries. Such generality tends to introduce significant runtime inefficiency, especially in the context of memory-resident systems, because the granularity of data commercial system, using SVM. processing (a tuple) is too small compared with the associated overhead. Another disadvantage in such an engine is that each operator code is compiled statically, so query-specific optimization cannot be applied. To improve runtime efficiency, we propose a compiled execution engine, which, for a given query, generates new query-specific code on the fly, and then dynamically compiles and executes the code. The Java platform makes our approach particularly interesting for several reasons: (1) modern Java Virtual Machines (JVM) have Just- In-Time (JIT) compilers that optimize code at runtime based on the execution pattern, a key feature that SVMs lack; (2) because of Java’s continued popularity, JVMs keep improving at a faster pace than SVMs, allowing us to exploit new advances in the Java runtime in the future; (3) Java is a dynamic language, which makes it convenient to load a piece of new code on the fly. In this paper, we develop both an interpreted and a compiled query execution engine in a relational, Java-based, in-memory database prototype, and perform an experimental study. Our experimental results on the TPC-H data set show that, despite both engines benefiting from JIT, the compiled engine runs on average about twice as fast as the interpreted one, and significantly faster than an in-memory

查看原文本刊更多论文

使用JVM的编译查询执行引擎

数据库系统中的传统查询执行引擎本质上使用SQL虚拟机(SVM)来解释数据流树，其中每个节点都与一个关系操作符相关联。在查询求值期间，每次处理一个元组，并在操作符之间传递。这种模型因其对流水线处理的效率而广受欢迎。但是，由于每个操作符都是静态实现的，因此它必须非常通用，以便处理所有可能的查询。这种通用性往往会引入显著的运行时效率低下，特别是在内存驻留系统的上下文中，因为使用支持向量机的商业系统的数据粒度。与相关的开销相比，处理(元组)太小了。这种引擎的另一个缺点是，每个操作符代码都是静态编译的，因此无法应用特定于查询的优化。为了提高运行时效率，我们提出了一个编译执行引擎，对于给定的查询，它动态地生成新的特定于查询的代码，然后动态地编译和执行代码。Java平台使我们的方法特别有趣，原因如下:(1)现代Java虚拟机(JVM)具有即时(JIT)编译器，它基于执行模式在运行时优化代码，这是svm所缺乏的一个关键特性;(2)由于Java的持续流行，jvm以比svm更快的速度不断改进，使我们能够在未来利用Java运行时的新进展;(3) Java是一种动态语言，这使得动态加载一段新代码非常方便。在本文中，我们在一个基于java的关系型内存数据库原型中开发了一个解释型和编译型查询执行引擎，并进行了实验研究。我们在TPC-H数据集上的实验结果表明，尽管两个引擎都受益于JIT，但编译引擎的平均运行速度是解释引擎的两倍，并且明显快于内存中的引擎

本文章由计算机程序翻译，如有差异，请以英文原文为准。

求助全文

约1分钟内获得全文求助全文

来源期刊

22nd International Conference on Data Engineering (ICDE'06)

自引率

0.00%

发文量