Powering Practical Performance: Accelerated Numerical Computing in Pure Python
Matthew Penn, Chris Milroy
2022 IEEE High Performance Extreme Computing Conference (HPEC), 2022-09-19
DOI: https://doi.org/10.1109/HPEC55821.2022.9926309
Abstract
In this paper, we tackle a generic n-dimensional numerical computing problem to compare performance and analyze tradeoffs between popular frameworks, using open source Jupyter notebook examples. Most data science practitioners work in Python because of its high-level abstraction and rich set of numerical computing libraries. However, the choice of library and methodology is driven by constraints that affect complexity, such as problem size, latency, memory, physical size, weight, power, and hardware. To that end, we demonstrate that a wide selection of GPU-accelerated libraries (RAPIDS, CuPy, Numba, Dask), as well as the development of hand-tuned CUDA kernels, is accessible to data scientists without ever leaving Python. We address the Python developer community by showing that C/C++ is not necessary to access single- or multi-GPU acceleration for data science applications. We solve a common numerical computing problem on a GPU-equipped workstation without writing a line of C/C++: for every point in array A, find the closest point in array B and its index, which requires up to 8.8 trillion distance comparisons.
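To make the problem statement concrete, the following is a minimal CPU-only sketch of the brute-force search in NumPy. It is not the authors' implementation: the function name `nearest_in_b`, the `chunk` parameter, and the choice of Euclidean distance are assumptions made for illustration. A brute-force search compares every point in A with every point in B, so the number of distance comparisons is the product of the two array lengths; processing A in chunks keeps the intermediate distance matrix from growing with the full size of A.

```python
import numpy as np


def nearest_in_b(a, b, chunk=1024):
    """For each row of `a`, return the index of and distance to the closest row of `b`.

    Hypothetical helper for illustration; Euclidean distance is assumed.
    """
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    idx = np.empty(len(a), dtype=np.int64)
    dist = np.empty(len(a), dtype=np.float64)

    # Work on `chunk` rows of `a` at a time to bound peak memory.
    for start in range(0, len(a), chunk):
        block = a[start:start + chunk]                 # shape (m, d)
        diff = block[:, None, :] - b[None, :, :]       # shape (m, n, d)
        d2 = np.einsum("ijk,ijk->ij", diff, diff)      # squared distances, (m, n)
        nearest = d2.argmin(axis=1)                    # closest row of b per row of block
        idx[start:start + chunk] = nearest
        dist[start:start + chunk] = np.sqrt(d2[np.arange(len(block)), nearest])
    return idx, dist


# Small usage example with random 3-dimensional points.
rng = np.random.default_rng(0)
A = rng.random((5, 3))
B = rng.random((7, 3))
indices, distances = nearest_in_b(A, B, chunk=2)
```

The function uses only array operations (broadcasting, `einsum`, `argmin`), which is the style of code that GPU array libraries such as CuPy aim to support; whether this exact function would run unchanged on a GPU is an assumption that would need to be checked against the library in use.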