Sparse Deep Neural Network Inference Using Different Programming Models
Hyungro Lee, Milan Jain, Sayan Ghosh
2022 IEEE High Performance Extreme Computing Conference (HPEC), published 2022-09-19
DOI: 10.1109/HPEC55821.2022.9926362 (https://doi.org/10.1109/HPEC55821.2022.9926362)
Citations: 1
Abstract
Sparse deep neural networks have recently gained attention for achieving inference speedups with reduced memory footprints. Real-world applications often must deal with sparse data and irregular computation, yet a wide variety of Deep Neural Network (DNN) tasks remain dense, without exploiting the advantages of sparsity in networks. Recent works presented in the MIT/IEEE/Amazon GraphChallenge have demonstrated significant speedups and a variety of techniques. Still, we find that the impact of the various Python- and C/C++-based programming models has received limited investigation, leaving unexplored opportunities for the general case. In this work, we provide extensive quantitative evaluations of contemporary GPGPU programming models such as CuPy, CUDA cuSPARSE, and OpenMP in the context of Sparse Deep Neural Network (SpDNN) implementations (derived from the GraphChallenge reference serial code) on single and multiple GPUs from NVIDIA DGX A100 40GB/80GB platforms.
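To make the workload concrete, the GraphChallenge SpDNN inference kernel repeatedly applies a sparse-sparse matrix multiply followed by a bias, ReLU, and clipping step. The sketch below is a minimal CPU illustration using SciPy sparse matrices; the paper's implementations target GPUs via CuPy and cuSPARSE, and the function name, bias/clip values, and toy matrices here are illustrative assumptions, not the authors' code.

```python
import numpy as np
from scipy import sparse

def spdnn_forward(Y0, weights, bias=-0.1, ymax=32.0):
    """Sketch of a GraphChallenge-style sparse DNN forward pass:
    Y_{l+1} = min(ReLU(Y_l @ W_l + bias), ymax), with all matrices sparse.
    Bias is added only to stored (nonzero) entries, mirroring the
    reference formulation where zeros stay zero."""
    Y = Y0.tocsr()
    for W in weights:
        Z = Y @ W                                   # sparse-sparse SpGEMM
        Z.data += bias                              # bias on nonzeros only
        Z.data = np.clip(Z.data, 0.0, ymax)         # ReLU + saturation at ymax
        Z.eliminate_zeros()                         # keep the result sparse
        Y = Z
    return Y

# Hypothetical tiny network: 8 inputs of 16 features, 3 sparse layers.
rng = np.random.default_rng(0)
Y0 = sparse.random(8, 16, density=0.25, format="csr", random_state=rng)
Ws = [sparse.random(16, 16, density=0.1, format="csr", random_state=rng)
      for _ in range(3)]
Y = spdnn_forward(Y0, Ws)
```

On a GPU, the same structure maps almost line-for-line onto `cupyx.scipy.sparse`, which is one reason CuPy is an attractive programming model for this benchmark.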