Bamshad: A JIT compiler for running Java stream APIs on heterogeneous environments

2017 19th International Symposium on Computer Architecture and Digital Systems (CADS). Pub Date: 2017-12-01. DOI: 10.1109/CADS.2017.8310734

Bahram Yarahmadi, F. Khunjush



Abstract

Graphics Processing Units (GPUs) and other emerging accelerators play an important role in high-performance computing. Programming environments such as CUDA and OpenCL make these devices usable across a wide range of applications. However, programming GPUs is painful and error-prone: achieving reasonable performance requires considerable expertise, especially in low-level architectural features and memory management. Running high-level programming languages such as Java on the massive computational power of today's GPUs can lessen this burden. Java 8 introduced features such as lambda functions, which are used in Java parallel streams, and supporting these features is vital for using GPUs in real applications. In this paper, we introduce Bamshad, a just-in-time (JIT) compiler that ports lambda functions used in Java parallel streams to the GPU at runtime. Bamshad adopts a series of compiler techniques to transparently eliminate unnecessary data communication between CPUs and GPUs, so the programmer is not involved in the detailed process of tuning a GPU device to reduce communication. Experimental results show that the proposed JIT compiler yields a 13× improvement over sequential Java execution for all benchmarks, and a 3.9× improvement over parallel Java.
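For illustration, the construct the abstract refers to is a lambda passed to a Java 8 parallel stream. The sketch below is plain standard Java that runs on the CPU; it does not use Bamshad, whose interface the abstract does not describe. The class name `ParallelStreamSketch`, the methods `vectorAdd` and `vectorAddSequential`, and the choice of vector addition as a workload are all assumptions made for this example. The two methods correspond loosely to the "parallel Java" and "sequential Java" baselines mentioned above.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

// Hypothetical example class; not part of Bamshad or of the paper.
public class ParallelStreamSketch {

    // Parallel-stream version: the lambda body is the data-parallel kernel.
    // A JIT compiler of the kind the abstract describes would presumably
    // target a lambda like this one (an assumption, not a documented API).
    static float[] vectorAdd(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        float[] c = new float[a.length];
        // Each index is written by exactly one iteration, so no synchronization is needed.
        IntStream.range(0, a.length).parallel().forEach(i -> c[i] = a[i] + b[i]);
        return c;
    }

    // Sequential baseline: the same computation as an ordinary loop.
    static float[] vectorAddSequential(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("length mismatch");
        }
        float[] c = new float[a.length];
        for (int i = 0; i < a.length; i++) {
            c[i] = a[i] + b[i];
        }
        return c;
    }

    public static void main(String[] args) {
        int n = 1 << 20;
        float[] a = new float[n];
        float[] b = new float[n];
        for (int i = 0; i < n; i++) {
            a[i] = i;
            b[i] = 2f * i;
        }
        // Check that the parallel and sequential versions agree.
        boolean same = Arrays.equals(vectorAdd(a, b), vectorAddSequential(a, b));
        System.out.println("parallel result matches sequential: " + same);
    }
}
```

The lambda reads `a` and `b` and writes `c`, so on a GPU these arrays would have to be moved between host and device memory; this is the kind of CPU-GPU data communication the abstract says the compiler tries to minimize transparently.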
