Authors: B. Domonkos, G. Jakab
Venue: Spring conference on Computer graphics
Publication date: 2009-04-23
DOI: https://doi.org/10.1145/1980462.1980484
A programming model for GPU-based parallel computing with scalability and abstraction
In this paper, we present a multi-level programming model for recent GPU-based high-performance computing systems. By combining cooperative stream threads with symmetric multiprocessing threads, our model provides a computational framework that scales from multi-GPU environments to GPU-cluster systems. Instead of hiding the execution environment from the programmer through compiler extensions or metaprogramming techniques, we aim for a solution that both enables optimizations and provides abstract problem-space mapping, with code reusability and virtualization of hardware resources, in order to decrease the programming effort. We evaluate an implementation of our model based on CUDA, OpenMP, and MPI2 technologies on a complex practical application scenario and discuss its performance scaling behavior.
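The abstract does not spell out how the problem space is mapped onto the hardware levels. As a rough illustration only, the sketch below shows one way a one-dimensional index range could be divided hierarchically: first across cluster nodes (the MPI level), then across the GPUs of one node (the OpenMP level), then across thread blocks (the CUDA level). The names `Range`, `split`, and `blocks_for` are hypothetical and do not come from the paper, and the even contiguous split is an assumption rather than the authors' mapping. The code runs on the host and makes no CUDA, OpenMP, or MPI calls.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open index range [begin, end) of a 1-D problem space (hypothetical type).
struct Range {
    std::size_t begin;
    std::size_t end;
    std::size_t size() const { return end - begin; }
};

// Split `whole` into `parts` contiguous chunks whose sizes differ by at most
// one element, and return chunk number `index`.
Range split(Range whole, std::size_t parts, std::size_t index) {
    assert(parts > 0 && index < parts);
    const std::size_t n = whole.size();
    const std::size_t base = n / parts;
    const std::size_t extra = n % parts;  // the first `extra` chunks get one more element
    const std::size_t offset = index * base + (index < extra ? index : extra);
    const std::size_t length = base + (index < extra ? 1 : 0);
    return Range{whole.begin + offset, whole.begin + offset + length};
}

// Assumed three-level mapping: cluster node -> GPU on that node -> thread block.
// Returns the sub-ranges handled by each block of one GPU on one node.
std::vector<Range> blocks_for(Range problem,
                              std::size_t nodes, std::size_t node,
                              std::size_t gpus, std::size_t gpu,
                              std::size_t blocks) {
    const Range per_node = split(problem, nodes, node);  // MPI-level share
    const Range per_gpu = split(per_node, gpus, gpu);    // OpenMP-level share
    std::vector<Range> out;
    out.reserve(blocks);
    for (std::size_t b = 0; b < blocks; ++b) {
        out.push_back(split(per_gpu, blocks, b));         // CUDA-block-level share
    }
    return out;
}
```

For example, `blocks_for(Range{0, 100}, 2, 1, 2, 0, 4)` describes the first GPU on the second of two nodes: it owns indices [50, 75), which are divided into four block ranges starting with [50, 57) and ending with [69, 75). Because every level reuses the same `split` function, the application code that consumes a `Range` does not need to know how many nodes or GPUs are present, which is the kind of hardware virtualization the abstract alludes to.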