Comparing the Performance of Julia on CPUs versus GPUs and Julia-MPI versus Fortran-MPI: a case study with MPAS-Ocean (Version 7.1)

Impact factor: 4 | Zone 3 (Earth Sciences) | Q1: Geosciences, Multidisciplinary

Geoscientific Model Development. Pub Date: 2023-10-05. DOI: 10.5194/gmd-16-5539-2023

Siddhartha Bishnu, Robert R. Strauss, Mark R. Petersen


Citations: 1

Abstract

Some programming languages are easy to develop in at the cost of slow execution, while others are fast at runtime but much more difficult to write. Julia is a programming language that aims to be the best of both worlds: a development and production language at the same time. To test Julia's utility in scientific high-performance computing (HPC), we built an unstructured-mesh shallow water model in Julia and compared it against an established Fortran-MPI ocean model, the Model for Prediction Across Scales–Ocean (MPAS-Ocean), as well as a Python shallow water code. Three versions of the Julia shallow water code were created: for single-core CPU, graphics processing unit (GPU), and Message Passing Interface (MPI) CPU clusters. Comparing identical simulations revealed that our first version of the Julia model was 13 times faster than Python using NumPy, where both used an unthreaded single-core CPU. Further Julia optimizations, including static typing and removing implicit memory allocations, provided an additional 10–20× speed-up of the single-core CPU Julia model. The GPU-accelerated Julia code was almost identical in terms of performance to the MPI-parallelized code on 64 processes, an unexpected result for such different architectures. Parallelized Julia-MPI performance was identical to Fortran-MPI MPAS-Ocean for low processor counts and ranged from 2× faster to 2× slower for higher processor counts. Our experience is that Julia development is fast and convenient for prototyping but that Julia requires further investment and expertise to be competitive with compiled codes. We provide advice on Julia code optimization for HPC systems.


Source journal

Geoscientific Model Development (Geosciences, Multidisciplinary)

CiteScore: 8.60

Self-citation rate: 9.80%

Articles published: 352

Review time: 6-12 weeks

About the journal: Geoscientific Model Development (GMD) is an international scientific journal dedicated to the publication and public discussion of the description, development, and evaluation of numerical models of the Earth system and its components. The following manuscript types can be considered for peer-reviewed publication:

* geoscientific model descriptions, from statistical models to box models to GCMs;
* development and technical papers, describing developments such as new parameterizations or technical aspects of running models such as the reproducibility of results;
* new methods for assessment of models, including work on developing new metrics for assessing model performance and novel ways of comparing model results with observational data;
* papers describing new standard experiments for assessing model performance or novel ways of comparing model results with observational data;
* model experiment descriptions, including experimental details and project protocols;
* full evaluations of previously published models.