XMT-GPU: A PRAM Architecture for Graphics Computation

2008 37th International Conference on Parallel Processing Pub Date : 2008-09-09 DOI:10.1109/ICPP.2008.35

Thomas M. DuBois, Bryant C. Lee, Yi Wang, M. Olano, U. Vishkin

Citations: 4

Abstract

XMT-GPU: A PRAM Architecture for Graphics Computation

The shading processors in graphics hardware are becoming increasingly general-purpose. We test, through simulation and benchmarking, the potential performance impact of replacing these processors with a fully general-purpose parallel processor, without the fixed-function graphics hardware legacy of current graphics processing units (GPUs). The representative general-purpose processor we test against is XMT (for explicit multi-threading), a PRAM-like single-chip parallel architecture. Performance is compared for two characteristic shaders running in a fragment-limited GPU benchmark harness and on a cycle-accurate XMT simulator. The general-purpose processor is found to be significantly faster on a compute-only shader, but slower on a memory-bound texture shader. Finally, we analyze the design tradeoffs that would allow combining the best of both worlds: (i) a competitive XMT texture shader, with (ii) a general-purpose, easy-to-program XMT many-core approach that scales up or down to the amount of parallelism provided by the application and is even compatible with serial code.
