Architecting energy-efficient STT-RAM based register file on GPGPUs via delta compression

Hang Zhang, Xuhao Chen, Nong Xiao, Fang Liu
Venue: 2016 53rd ACM/EDAC/IEEE Design Automation Conference (DAC)
Published: 2016-06-05
DOI: 10.1145/2897937.2897989 (https://doi.org/10.1145/2897937.2897989)
Citations: 23

Abstract

Architecting energy-efficient STT-RAM based register file on GPGPUs via delta compression
To facilitate efficient context switches, GPUs usually employ a large-capacity register file to accommodate a massive amount of context information. However, the large register file introduces high power consumption, owing to high leakage power SRAM cells. Emerging non-volatile STT-RAM memory has recently been studied as a potential replacement to alleviate the leakage challenge when constructing register files on GPUs. Unfortunately, due to the long write latency and high energy consumption associated with write operations in STT-RAM, simply replacing SRAM with STT-RAM for register files would incur non-trivial performance overhead and only bring marginal energy benefits. In this paper, we propose to optimize STT-RAM based GPU register files for better energy-efficiency and performance via two techniques. First, we employ a light-weight compression framework with awareness of register value similarity. It is coupled with a group-based write driver control to mitigate the high energy overhead caused by STT-RAM writes. Second, to address the long write latency overhead of STT-RAM, we propose a centralized SRAM-based write buffer design to efficiently absorb STT-RAM writes with better buffer utilization, rather than the conventional design with distributed per-bank based write buffers. The experimental results show that our STT-RAM based register file design consumes only 37.4% energy over the SRAM baseline, while incurring only negligible performance degradation.
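The abstract does not spell out the compression format, so the following is only a minimal sketch of generic base-plus-delta compression, the family of techniques the title points to. It assumes a warp-wide register of 32 lanes with 32-bit values, uses lane 0 as the base, and tries delta widths of 0, 8, and 16 bits; the names `delta_compress`, `delta_decompress`, `written_bits`, and `WARP_SIZE`, along with those widths, are illustrative assumptions and not the authors' design.

```python
from typing import List, Tuple

WARP_SIZE = 32  # assumed number of lanes in one warp-wide register


def delta_compress(values: List[int], word_bits: int = 32) -> Tuple[int, int, List[int]]:
    """Return (base, delta_bits, deltas) for one warp-wide register.

    The base is lane 0's value and each delta is a signed offset from it.
    delta_bits is the narrowest assumed width (0, 8, 16) that holds every
    delta, or word_bits when the values are too dissimilar to compress.
    """
    base = values[0]
    deltas = [v - base for v in values]
    for bits in (0, 8, 16):
        if bits == 0:
            fits = all(d == 0 for d in deltas)
        else:
            lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
            fits = all(lo <= d <= hi for d in deltas)
        if fits:
            return base, bits, deltas
    return base, word_bits, deltas  # incompressible: fall back to full words


def delta_decompress(base: int, deltas: List[int]) -> List[int]:
    """Rebuild the original lane values from the base and the deltas."""
    return [base + d for d in deltas]


def written_bits(delta_bits: int, lanes: int = WARP_SIZE, word_bits: int = 32) -> int:
    """Bits written per register: one base word plus one delta per lane,
    or every lane as a full word when compression failed."""
    if delta_bits == word_bits:
        return lanes * word_bits
    return word_bits + delta_bits * lanes


# Similar values, such as per-thread addresses with a stride of 4.
values = [1000 + 4 * lane for lane in range(WARP_SIZE)]
base, bits, deltas = delta_compress(values)
```

For the strided values above, this sketch writes 288 bits instead of the 1024 bits of an uncompressed register. Fewer written bits is the property an STT-RAM design could exploit to cut write energy; how the paper groups its write drivers to take advantage of it is not described in the abstract.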