Fast and Highly Optimizing Separate Compilation for Automatic Parallelization

2019 International Conference on High Performance Computing & Simulation (HPCS). Pub Date: 2019-07-01. DOI: 10.1109/HPCS48598.2019.9188148

Tohma Kawasumi, Ryota Tamura, Yuya Asada, Jixin Han, Hiroki Mikami, K. Kimura, H. Kasahara

Citations: 0

Abstract

Automatic parallelization by a compiler is a promising approach for fully utilizing a multicore processor. Without compiler support, a programmer must take into account both the parallelism in a program and its use of the memory hierarchy. However, the scope for parallelization and optimization across multiple compilation units is limited because interprocedural analysis information is not available at compile time. This is a serious obstacle to parallelizing practical programs, which usually consist of multiple compilation units and rely on separate compilation to keep the program maintainable and to reduce recompilation time. In this paper, we propose a separate compilation method for automatic parallelization by a compiler. The method enables parallelization across multiple compilation units and minimizes recompilation time by emitting analysis information together with the object file for each compilation unit at compile time. We also propose an automatically parallelizing compilation flow that uses this analysis information. An experimental evaluation using large real-world control system programs from industry shows that the proposed technique achieves 29% better performance than separate compilation without the proposed method, and reduces compilation time by up to 90% with only a 1% performance loss compared with compiling the fully unified source code as a single compilation unit.
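
The general idea of keeping analysis information next to each object file can be illustrated with a small sketch. This is not the authors' implementation: the `Summary` record, the `UnitCache` class, the toy source format, and the read/write conflict test are all assumptions made for illustration. The sketch caches a per-function summary for each compilation unit, reanalyzes a unit only when its source changes, and uses the summaries to decide whether calls in different units may run in parallel.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Summary:
    """Hypothetical per-function analysis record: globals read and written."""
    reads: frozenset
    writes: frozenset


@dataclass
class UnitCache:
    """Maps a unit name to (source hash, summaries). This stands in for an
    object file stored together with its analysis information."""
    entries: dict = field(default_factory=dict)
    compiles: int = 0  # number of units actually (re)analyzed

    def compile_unit(self, name, source, analyze):
        digest = hashlib.sha256(source.encode()).hexdigest()
        cached = self.entries.get(name)
        if cached is not None and cached[0] == digest:
            return cached[1]         # unchanged unit: reuse stored analysis
        self.compiles += 1
        summaries = analyze(source)  # the expensive step in a real compiler
        self.entries[name] = (digest, summaries)
        return summaries


def toy_analyze(source):
    """Parse lines of the assumed toy form 'name: reads a,b; writes c'."""
    summaries = {}
    for line in source.strip().splitlines():
        name, rest = line.split(":")
        reads_part, writes_part = rest.split(";")
        # Drop the leading keyword and keep the variable names.
        reads = frozenset(reads_part.replace(",", " ").split()[1:])
        writes = frozenset(writes_part.replace(",", " ").split()[1:])
        summaries[name.strip()] = Summary(reads, writes)
    return summaries


def can_run_in_parallel(a, b):
    """Two calls are independent if neither writes what the other touches."""
    return (not (a.writes & (b.reads | b.writes))
            and not (b.writes & (a.reads | a.writes)))


# Usage: two units analyzed separately, then queried across unit boundaries.
cache = UnitCache()
unit_a = "f: reads x; writes y"
unit_b = "g: reads z; writes w\nh: reads y; writes x"
sa = cache.compile_unit("a.c", unit_a, toy_analyze)
sb = cache.compile_unit("b.c", unit_b, toy_analyze)
f_g_parallel = can_run_in_parallel(sa["f"], sb["g"])
f_h_parallel = can_run_in_parallel(sa["f"], sb["h"])
cache.compile_unit("a.c", unit_a, toy_analyze)  # unchanged: served from cache
compiles_after_rebuild = cache.compiles
```

In this sketch the cross-unit query never needs the source of the other unit, only its stored summary, which is why an unchanged unit does not have to be reanalyzed. A real parallelizing compiler would record far richer information than read and write sets.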
