Scalable Multi-GPU Decoupled Parallel Rendering Approach in Shared Memory Architecture
Huahai Liu, Pan Wang, Kewen Wang, Xun Cai, L. Zeng, Sikun Li
2011 International Conference on Virtual Reality and Visualization, published 2011-11-04
DOI: 10.1109/ICVRV.2011.46 (https://doi.org/10.1109/ICVRV.2011.46)
Citations: 6
Abstract
As the price-performance ratio of GPUs improves, many systems can accommodate more than one GPU per node, and each GPU provides substantial rendering power. Organizing the parallel rendering pipeline effectively is therefore essential to fully exploit the system's compute units. However, many parallel rendering systems couple the hardware rendering stage with the composition stage in the display thread, which frequently stalls the GPU. In this paper, we describe a decoupled parallel rendering approach that lets the two stages execute in parallel. With the frame buffer kept in main memory, the full-image rendering time is determined entirely by the GPU rendering capability when the rendering task is large enough. Both theoretical analysis and experimental results show that our method performs much better than the coupled parallel rendering method. We also test the scalability of the approach and obtain a speedup that is linear in the number of GPUs when the rendering task is large enough. The approach is easy to implement, and any parallel rendering application can benefit from it.
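The difference between the coupled and decoupled pipelines can be illustrated with a toy timing model. The sketch below is not the paper's analysis: the function names, the per-frame costs, and the assumptions of perfect work division across GPUs and unbounded frame buffering in main memory are all hypothetical simplifications chosen for illustration.

```python
def coupled_total_time(num_frames, render_time, compose_time):
    """Coupled pipeline: one thread renders a frame, then composes it.

    The GPU sits idle while composition runs, so the costs add up.
    """
    return num_frames * (render_time + compose_time)


def decoupled_total_time(num_frames, render_time, compose_time):
    """Decoupled pipeline: composition of frame i overlaps rendering of frame i+1.

    Assumes (hypothetically) that rendered frames can always be buffered in
    main memory, so the render stage never waits for the compositor.
    """
    render_done = 0.0   # time at which the latest frame finished rendering
    compose_done = 0.0  # time at which the latest frame finished composition
    for _ in range(num_frames):
        render_done += render_time
        # Composition starts once the frame is rendered AND the compositor is free.
        compose_done = max(compose_done, render_done) + compose_time
    return compose_done


def render_time_per_frame(work, num_gpus):
    """Idealized split of one frame's rendering work across GPUs (assumption)."""
    return work / num_gpus


if __name__ == "__main__":
    frames, work, compose = 100, 8.0, 0.5
    for gpus in (1, 2, 4):
        r = render_time_per_frame(work, gpus)
        print(gpus,
              coupled_total_time(frames, r, compose),
              decoupled_total_time(frames, r, compose))
```

In this model, when rendering dominates, the decoupled total time approaches `num_frames * render_time`, which matches the abstract's statement that frame time is decided by GPU rendering ability for large tasks. Dividing `work` by the GPU count then yields a near-linear speedup, while the coupled variant retains the full composition cost in every frame.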