一种新的gpu计算模型及其在I/O优化排序算法中的应用

2014 IEEE International Parallel & Distributed Processing Symposium Workshops Pub Date : 2014-05-19 DOI:10.1109/IPDPSW.2014.72

A. Koike, K. Sadakane

{"title":"一种新的gpu计算模型及其在I/O优化排序算法中的应用","authors":"A. Koike, K. Sadakane","doi":"10.1109/IPDPSW.2014.72","DOIUrl":null,"url":null,"abstract":"We propose a novel computational model for GPU. Known parallel computational models such as the PRAM model are not appropriate for evaluating GPU algorithms. Our model, called AGPU, abstracts the essence of current GPU architectures such as global and shared memory, memory coalescing and bank conflicts. We can therefore evaluate asymptotic behavior of GPU algorithms more accurately than known models and we can develop algorithms that are efficient on many real architectures. As a showcase, we first analyze known comparison-based sorting algorithms using the AGPU model and show that they are not I/O optimal, that is, the number of global memory accesses is more than necessary. Then we propose a new algorithm which uses an asymptotically optimal number of global memory accesses and whose time complexity is also nearly optimal.","PeriodicalId":153864,"journal":{"name":"2014 IEEE International Parallel & Distributed Processing Symposium Workshops","volume":"59 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-05-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"5","resultStr":"{\"title\":\"A Novel Computational Model for GPUs with Application to I/O Optimal Sorting Algorithms\",\"authors\":\"A. Koike, K. Sadakane\",\"doi\":\"10.1109/IPDPSW.2014.72\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We propose a novel computational model for GPU. Known parallel computational models such as the PRAM model are not appropriate for evaluating GPU algorithms. Our model, called AGPU, abstracts the essence of current GPU architectures such as global and shared memory, memory coalescing and bank conflicts. We can therefore evaluate asymptotic behavior of GPU algorithms more accurately than known models and we can develop algorithms that are efficient on many real architectures. As a showcase, we first analyze known comparison-based sorting algorithms using the AGPU model and show that they are not I/O optimal, that is, the number of global memory accesses is more than necessary. Then we propose a new algorithm which uses an asymptotically optimal number of global memory accesses and whose time complexity is also nearly optimal.\",\"PeriodicalId\":153864,\"journal\":{\"name\":\"2014 IEEE International Parallel & Distributed Processing Symposium Workshops\",\"volume\":\"59 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2014-05-19\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"5\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2014 IEEE International Parallel & Distributed Processing Symposium Workshops\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/IPDPSW.2014.72\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE International Parallel & Distributed Processing Symposium Workshops","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPSW.2014.72","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 5

摘要

我们提出了一种新的GPU计算模型。已知的并行计算模型，如PRAM模型，不适合评估GPU算法。我们的模型，称为AGPU，抽象了当前GPU架构的本质，如全局和共享内存，内存合并和银行冲突。因此，我们可以比已知模型更准确地评估GPU算法的渐近行为，并且我们可以开发在许多实际架构上有效的算法。作为演示，我们首先使用AGPU模型分析已知的基于比较的排序算法，并表明它们不是I/O最优的，也就是说，全局内存访问的数量超过了必要的数量。然后，我们提出了一种新的算法，该算法使用渐近最优的全局存储器访问数，其时间复杂度也接近最优。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文本刊更多论文

A Novel Computational Model for GPUs with Application to I/O Optimal Sorting Algorithms

We propose a novel computational model for GPU. Known parallel computational models such as the PRAM model are not appropriate for evaluating GPU algorithms. Our model, called AGPU, abstracts the essence of current GPU architectures such as global and shared memory, memory coalescing and bank conflicts. We can therefore evaluate asymptotic behavior of GPU algorithms more accurately than known models and we can develop algorithms that are efficient on many real architectures. As a showcase, we first analyze known comparison-based sorting algorithms using the AGPU model and show that they are not I/O optimal, that is, the number of global memory accesses is more than necessary. Then we propose a new algorithm which uses an asymptotically optimal number of global memory accesses and whose time complexity is also nearly optimal.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2014 IEEE International Parallel & Distributed Processing Symposium Workshops

自引率

0.00%

发文量