Vibhav Vineet, P. Harish, Suryakant Patidar, P J Narayanan
{"title":"快速最小生成树在GPU上的大图形","authors":"Vibhav Vineet, P. Harish, Suryakant Patidar, P J Narayanan","doi":"10.1145/1572769.1572796","DOIUrl":null,"url":null,"abstract":"Graphics Processor Units are used for many general purpose processing due to high compute power available on them. Regular, data-parallel algorithms map well to the SIMD architecture of current GPU. Irregular algorithms on discrete structures like graphs are harder to map to them. Efficient data-mapping primitives can play crucial role in mapping such algorithms onto the GPU. In this paper, we present a minimum spanning tree algorithm on Nvidia GPUs under CUDA, as a recursive formulation of Borůvka's approach for undirected graphs. We implement it using scalable primitives such as scan, segmented scan and split. The irregular steps of supervertex formation and recursive graph construction are mapped to primitives like split to categories involving vertex ids and edge weights. We obtain 30 to 50 times speedup over the CPU implementation on most graphs and 3 to 10 times speedup over our previous GPU implementation. We construct the minimum spanning tree on a 5 million node and 30 million edge graph in under 1 second on one quarter of the Tesla S1070 GPU.","PeriodicalId":163044,"journal":{"name":"Proceedings of the Conference on High Performance Graphics 2009","volume":"347 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"126","resultStr":"{\"title\":\"Fast minimum spanning tree for large graphs on the GPU\",\"authors\":\"Vibhav Vineet, P. Harish, Suryakant Patidar, P J Narayanan\",\"doi\":\"10.1145/1572769.1572796\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Graphics Processor Units are used for many general purpose processing due to high compute power available on them. Regular, data-parallel algorithms map well to the SIMD architecture of current GPU. Irregular algorithms on discrete structures like graphs are harder to map to them. Efficient data-mapping primitives can play crucial role in mapping such algorithms onto the GPU. In this paper, we present a minimum spanning tree algorithm on Nvidia GPUs under CUDA, as a recursive formulation of Borůvka's approach for undirected graphs. We implement it using scalable primitives such as scan, segmented scan and split. The irregular steps of supervertex formation and recursive graph construction are mapped to primitives like split to categories involving vertex ids and edge weights. We obtain 30 to 50 times speedup over the CPU implementation on most graphs and 3 to 10 times speedup over our previous GPU implementation. We construct the minimum spanning tree on a 5 million node and 30 million edge graph in under 1 second on one quarter of the Tesla S1070 GPU.\",\"PeriodicalId\":163044,\"journal\":{\"name\":\"Proceedings of the Conference on High Performance Graphics 2009\",\"volume\":\"347 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2009-08-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"126\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the Conference on High Performance Graphics 2009\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/1572769.1572796\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Conference on High Performance Graphics 2009","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/1572769.1572796","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Fast minimum spanning tree for large graphs on the GPU
Graphics Processor Units are used for many general purpose processing due to high compute power available on them. Regular, data-parallel algorithms map well to the SIMD architecture of current GPU. Irregular algorithms on discrete structures like graphs are harder to map to them. Efficient data-mapping primitives can play crucial role in mapping such algorithms onto the GPU. In this paper, we present a minimum spanning tree algorithm on Nvidia GPUs under CUDA, as a recursive formulation of Borůvka's approach for undirected graphs. We implement it using scalable primitives such as scan, segmented scan and split. The irregular steps of supervertex formation and recursive graph construction are mapped to primitives like split to categories involving vertex ids and edge weights. We obtain 30 to 50 times speedup over the CPU implementation on most graphs and 3 to 10 times speedup over our previous GPU implementation. We construct the minimum spanning tree on a 5 million node and 30 million edge graph in under 1 second on one quarter of the Tesla S1070 GPU.