PATS: Pattern aware scheduling and power gating for GPGPUs

2014 23rd International Conference on Parallel Architecture and Compilation (PACT). Pub Date: 2014-08-24. DOI: 10.1145/2628071.2628105

Qiumin Xu, M. Annavaram


Citations: 44

Abstract

General purpose computing using graphics processing units (GPGPUs) is an attractive option to achieve power efficient throughput computing. But the power efficiency of GPGPUs can be significantly curtailed in the presence of divergence. This paper evaluates two important facets of this problem. First, we study the branch divergence behavior of various GPGPU workloads. We show that only a few branch divergence patterns are dominant in most workloads. In fact only five branch divergence patterns account for 60% of all the divergent instructions in our workloads. In the second part of this work we exploit this branch divergence pattern bias to propose a new divergence pattern aware warp scheduler, called PATS. PATS prioritizes scheduling warps with the same divergence pattern so as to create long idleness windows for any given execution lane. The long idleness windows are then exploited for efficiently power gating the unused lanes while amortizing the gating overhead. We describe the architectural implementation details of PATS and evaluate the power and performance impact of PATS. Our proposed design significantly improves power gating efficiency of GPGPUs with minimal performance overhead.
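
The scheduling idea in the abstract can be illustrated with a toy model. The sketch below is not the paper's implementation: the 4-lane warp width, the bitmask encoding of divergence patterns, the helper names (`idle_windows`, `gated_slots`, `pattern_aware_order`), and the break-even threshold are all assumptions made for illustration. It shows only that issuing warps with the same active-lane mask back to back lengthens the idle window seen by each lane, which is what gives power gating a chance to amortize its overhead.

```python
from collections import Counter
from itertools import groupby

def idle_windows(masks, lane):
    """Lengths of consecutive issue slots in which `lane` is inactive.

    `masks` is the issue order of warps, each given as an active-lane
    bitmask (bit i set means lane i executes). Hypothetical encoding.
    """
    runs = []
    for idle, group in groupby(masks, key=lambda m: ((m >> lane) & 1) == 0):
        if idle:
            runs.append(sum(1 for _ in group))
    return runs

def gated_slots(masks, lane, break_even):
    """Toy estimate of issue slots a lane spends usefully gated.

    Assumption: a window saves energy only for the slots beyond a
    break-even length that stands in for the gating overhead.
    """
    return sum(w - break_even for w in idle_windows(masks, lane) if w > break_even)

def pattern_aware_order(masks):
    """Group ready warps by divergence pattern, most common pattern first.

    A simplification of the prioritization described in the abstract;
    a real scheduler works on a dynamic ready queue, not a fixed list.
    """
    counts = Counter(masks)
    return sorted(masks, key=lambda m: (-counts[m], m))

# Two divergence patterns interleaved, as a pattern-unaware order might issue them.
baseline = [0b0011, 0b1100, 0b0011, 0b1100, 0b0011, 0b1100]
grouped = pattern_aware_order(baseline)

print(idle_windows(baseline, 0))  # [1, 1, 1]: lane 0 idles in short bursts
print(idle_windows(grouped, 0))   # [3]: one long window for lane 0
print(gated_slots(baseline, 0, break_even=2), gated_slots(grouped, 0, break_even=2))  # 0 1
```

In this toy model the interleaved order never gives lane 0 a window longer than the break-even length, while the grouped order does. The model ignores latency, dependences, and the performance cost of reordering, all of which the paper evaluates.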
