Vishwesh Jatala, Roshan Dathathri, G. Gill, Loc Hoang, V. K. Nandivada, K. Pingali
{"title":"A Study of Graph Analytics for Massive Datasets on Distributed Multi-GPUs","authors":"Vishwesh Jatala, Roshan Dathathri, G. Gill, Loc Hoang, V. K. Nandivada, K. Pingali","doi":"10.1109/IPDPS47924.2020.00019","DOIUrl":null,"url":null,"abstract":"There are relatively few studies of distributed GPU graph analytics systems in the literature and they are limited in scope since they deal with small data-sets, consider only a few applications, and do not consider the interplay between partitioning policies and optimizations for computation and communication.In this paper, we present the first detailed analysis of graph analytics applications for massive real-world datasets on a distributed multi-GPU platform and the first analysis of strong scaling of smaller real-world datasets. We use D-IrGL, the state-of-the-art distributed GPU graph analytical framework, in our study. Our evaluation shows that (1) the Cartesian vertex-cut partitioning policy is critical to scale computation out on GPUs even at a small scale, (2) static load imbalance is a key factor in performance since memory is limited on GPUs, (3) device-host communication is a significant portion of execution time and should be optimized to gain performance, and (4) asynchronous execution is not always better than bulk-synchronous execution.","PeriodicalId":6805,"journal":{"name":"2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","volume":"32 1","pages":"84-94"},"PeriodicalIF":0.0000,"publicationDate":"2020-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"9","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPS47924.2020.00019","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 9
Abstract
There are relatively few studies of distributed GPU graph analytics systems in the literature and they are limited in scope since they deal with small data-sets, consider only a few applications, and do not consider the interplay between partitioning policies and optimizations for computation and communication.In this paper, we present the first detailed analysis of graph analytics applications for massive real-world datasets on a distributed multi-GPU platform and the first analysis of strong scaling of smaller real-world datasets. We use D-IrGL, the state-of-the-art distributed GPU graph analytical framework, in our study. Our evaluation shows that (1) the Cartesian vertex-cut partitioning policy is critical to scale computation out on GPUs even at a small scale, (2) static load imbalance is a key factor in performance since memory is limited on GPUs, (3) device-host communication is a significant portion of execution time and should be optimized to gain performance, and (4) asynchronous execution is not always better than bulk-synchronous execution.