{"title":"使用CUDA在GPU上进行高性能计算和模拟","authors":"M. Ujaldón","doi":"10.1109/HPCSim.2012.6266884","DOIUrl":null,"url":null,"abstract":"The computational power and memory bandwidth of graphics processing units (GPUs) have turned them into attractive platforms for general-purpose applications at significant speed gains versus their CPU counterparts [1]. In addition, an increasing number of today's state-of-the-art supercomputers include commodity GPUs to bring us unprecedented levels of performance in terms of raw GFLOPS and GFLOPS/cost. In this paper, we provide an introduction to CUDA programming paradigm with an emphasis on simulations which can exploit SIMD parallelism and high memory bandwidth on GPUs. OpenCL is also briefly described as a recent standardization effort to set up an open standard API for general-purpose manycore architectures.","PeriodicalId":428764,"journal":{"name":"2012 International Conference on High Performance Computing & Simulation (HPCS)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2012-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":"{\"title\":\"High performance computing and simulations on the GPU using CUDA\",\"authors\":\"M. Ujaldón\",\"doi\":\"10.1109/HPCSim.2012.6266884\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"The computational power and memory bandwidth of graphics processing units (GPUs) have turned them into attractive platforms for general-purpose applications at significant speed gains versus their CPU counterparts [1]. In addition, an increasing number of today's state-of-the-art supercomputers include commodity GPUs to bring us unprecedented levels of performance in terms of raw GFLOPS and GFLOPS/cost. In this paper, we provide an introduction to CUDA programming paradigm with an emphasis on simulations which can exploit SIMD parallelism and high memory bandwidth on GPUs. OpenCL is also briefly described as a recent standardization effort to set up an open standard API for general-purpose manycore architectures.\",\"PeriodicalId\":428764,\"journal\":{\"name\":\"2012 International Conference on High Performance Computing & Simulation (HPCS)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2012-07-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"12\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2012 International Conference on High Performance Computing & Simulation (HPCS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/HPCSim.2012.6266884\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2012 International Conference on High Performance Computing & Simulation (HPCS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPCSim.2012.6266884","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
High performance computing and simulations on the GPU using CUDA
The computational power and memory bandwidth of graphics processing units (GPUs) have turned them into attractive platforms for general-purpose applications at significant speed gains versus their CPU counterparts [1]. In addition, an increasing number of today's state-of-the-art supercomputers include commodity GPUs to bring us unprecedented levels of performance in terms of raw GFLOPS and GFLOPS/cost. In this paper, we provide an introduction to CUDA programming paradigm with an emphasis on simulations which can exploit SIMD parallelism and high memory bandwidth on GPUs. OpenCL is also briefly described as a recent standardization effort to set up an open standard API for general-purpose manycore architectures.