GPU-CC: a reconfigurable GPU architecture with communicating cores

M-SCOPES, Pub Date: 2013-06-19, DOI: 10.1145/2463596.2486153

Gert-Jan van den Braak, H. Corporaal

Citations: 6

Abstract

GPUs have evolved into programmable, energy-efficient compute accelerators for massively parallel applications. Still, many applications lose compute power because cycles are spent on data movement and control instead of on computations on actual data. Additional cycles can be lost to pipeline stalls caused by long-latency operations. To improve performance and energy efficiency, we introduce GPU-CC: a reconfigurable GPU architecture with communicating cores. It is based on a contemporary GPU, which can still be used as such, but it can also reorganize the cores of the GPU into a reconfigurable network. In GPU-CC, data movement and control are implicit in the configuration of the communication network. Additionally, each core executes a fixed instruction, which reduces the instruction decode count and increases energy efficiency. We show a large performance potential for GPU-CC, e.g. 1.9x and 2.4x for a 3x3 and a 5x5 convolution application, respectively. The hardware cost of GPU-CC is mainly determined by the buffers in the added network, which amount to 12.4% of extra memory space.
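
The idea of cores that each execute one fixed instruction and pass results to each other through buffers can be illustrated with a small software model. The sketch below is not the GPU-CC hardware or the authors' implementation: the `Core` class, the `conv1d_network` function, and the choice of a 1D three-tap convolution (instead of the 3x3 and 5x5 cases in the abstract) are all assumptions made for illustration. The shifted input streams stand in for routing that, in the abstract's terms, would be implicit in the network configuration.

```python
from collections import deque


class Core:
    """Hypothetical core that repeats one fixed instruction on its input FIFOs."""

    def __init__(self, op, inputs):
        self.op = op          # the fixed instruction, modelled as a callable
        self.inputs = inputs  # upstream FIFOs (deques) this core reads from
        self.out = deque()    # output FIFO, read by exactly one downstream core

    def step(self):
        # Fire only when every input buffer holds a value (dataflow rule).
        if all(self.inputs):
            args = [fifo.popleft() for fifo in self.inputs]
            self.out.append(self.op(*args))
            return True
        return False


def conv1d_network(x, kernel):
    """Stream x through a chain of multiply and add cores (illustrative only)."""
    n_out = len(x) - len(kernel) + 1
    # One shifted input stream per kernel tap; no core computes an address.
    taps = [deque(x[j:j + n_out]) for j in range(len(kernel))]
    # One multiply core per tap, each with its coefficient fixed up front.
    muls = [Core(lambda v, k=k: v * k, [tap]) for tap, k in zip(taps, kernel)]
    cores = list(muls)
    # Chain of add cores that accumulates the products.
    acc = muls[0]
    for m in muls[1:]:
        acc = Core(lambda a, b: a + b, [acc.out, m.out])
        cores.append(acc)
    # Step every core each round until no core can fire any more.
    while any([core.step() for core in cores]):
        pass
    return list(acc.out)


print(conv1d_network([1, 2, 3, 4, 5], (1, 0, -1)))  # [-2, -2, -2]
```

In this model no core fetches or decodes a new instruction between data items, and no core executes loads, stores, or loop control; whether and how much that helps on real hardware is what the reported 1.9x and 2.4x figures address, and the model makes no claim about those numbers.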
