Huma Milisic, Dina Ahmic, Hamdija Sinanovic, E. Saric, Amar Asotic, Alvin Huseinović
2016 XI International Symposium on Telecommunications (BIHTEL), published 2016-10-01. DOI: 10.1109/BIHTEL.2016.7775712 (https://doi.org/10.1109/BIHTEL.2016.7775712)
Parallelization challenges of BFS traversal on dense graphs using the CUDA platform
This paper presents the challenges encountered while parallelizing an existing sequential algorithm, using a breadth-first search (BFS) implementation of quadratic time complexity written in CUDA C++ as the case study. Although BFS might seem easy to parallelize because it performs many independent iterations over the graph's vertices, other important aspects need to be considered. Granularity, communication, and load balancing are thoroughly examined to determine their exact impact on BFS. The implementation is profiled to find the bottlenecks and problems that prevent its further optimization.
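
The abstract does not reproduce the paper's code, so the following is only a minimal host-side C++ sketch of what a quadratic-time BFS on a dense graph might look like. It assumes the graph is stored as a row-major adjacency matrix, which is one common way to arrive at O(V^2) work. The function name `bfs_levels` and the matrix layout are hypothetical and are not taken from the paper. The outer loop over `u` is the part with largely independent iterations that a CUDA port would plausibly map to one thread per vertex.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Level-synchronous BFS over a dense graph stored as a row-major
// adjacency matrix: adj[u * n + v] != 0 means there is an edge u -> v.
// Returns the BFS level of every vertex, or -1 if it is unreachable.
// Assumption: the layout and all names here are illustrative only.
std::vector<int> bfs_levels(const std::vector<unsigned char>& adj,
                            std::size_t n, std::size_t source) {
    assert(adj.size() == n * n && source < n);
    std::vector<int> level(n, -1);
    level[source] = 0;

    bool changed = true;
    for (int depth = 0; changed; ++depth) {
        changed = false;
        // Within one level, iterations over u do not depend on each other,
        // apart from possibly writing the same value to the same level[v].
        for (std::size_t u = 0; u < n; ++u) {
            if (level[u] != depth) continue;  // u is not on the current frontier
            // Scanning a full matrix row costs O(V) per frontier vertex,
            // which gives O(V^2) work over the whole traversal.
            for (std::size_t v = 0; v < n; ++v) {
                if (adj[u * n + v] && level[v] == -1) {
                    level[v] = depth + 1;
                    changed = true;
                }
            }
        }
    }
    return level;
}
```

In this sketch only the vertices on the current frontier do any row scanning, while every other iteration exits immediately. With one thread per vertex, that imbalance is one plausible source of the load-balancing concerns the abstract mentions.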