Parallelization challenges of BFS traversal on dense graphs using the CUDA platform
Huma Milisic, Dina Ahmic, Hamdija Sinanovic, E. Saric, Amar Asotic, Alvin Huseinović
2016 XI International Symposium on Telecommunications (BIHTEL), October 2016
DOI: https://doi.org/10.1109/BIHTEL.2016.7775712
Citations: 0
Abstract
This paper presents the challenges encountered while parallelizing an existing sequential algorithm, using a breadth-first search (BFS) implementation of quadratic time complexity written in CUDA C++. Although BFS might seem easy to parallelize because it performs many independent iterations over graph vertices, other important aspects must be considered. Granularity, communication, and load balancing are examined in detail to determine their impact on BFS. The implementation is profiled to identify the bottlenecks and problems that prevent its further optimization.
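
To make the "independent iterations over graph vertices" concrete, here is a minimal sequential sketch of a level-synchronous BFS whose worst-case cost is quadratic in the number of vertices. It is not the authors' implementation: the abstract does not state the graph representation, so the dense row-major adjacency matrix, the function name `bfs_levels`, and the `-1` marker for unreachable vertices are all assumptions made for illustration. The code is plain host-side C++, which is also valid CUDA C++.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the paper): level-synchronous BFS over a
// dense n x n adjacency matrix stored row-major in `adj`.
// Returns the BFS level of every vertex, or -1 if it is unreachable.
std::vector<int> bfs_levels(const std::vector<unsigned char>& adj,
                            std::size_t n, std::size_t source) {
    assert(adj.size() == n * n);
    assert(source < n);

    std::vector<int> level(n, -1);
    level[source] = 0;

    bool changed = true;
    for (int depth = 0; changed; ++depth) {
        changed = false;
        // Within one level, the iterations of this loop do not depend on
        // each other. A GPU version could assign one thread per vertex v;
        // threads that discover the same u would all store depth + 1.
        for (std::size_t v = 0; v < n; ++v) {
            if (level[v] != depth) continue;  // v is not on the current frontier
            // Scanning a full matrix row is what makes the cost quadratic.
            for (std::size_t u = 0; u < n; ++u) {
                if (adj[v * n + u] && level[u] == -1) {
                    level[u] = depth + 1;
                    changed = true;
                }
            }
        }
    }
    return level;
}
```

In this sketch the amount of work per vertex depends on whether the vertex is on the current frontier, and the `changed` flag is shared by every iteration. These are the kinds of load-balancing and communication concerns the abstract refers to, although the paper's own analysis may frame them differently.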