AMOEBA: a coarse grained reconfigurable architecture for dynamic GPU scaling

Xianwei Cheng, Hui Zhao, M. Kandemir, Beilei Jiang, Gayatri Mehta
Proceedings of the 34th ACM International Conference on Supercomputing
DOI: 10.1145/3392717.3392738 (https://doi.org/10.1145/3392717.3392738)
Publication date (as listed): 2019-11-08
Citations: 2

Abstract

Different GPU applications exhibit varying scalability patterns with respect to network-on-chip (NoC), coalescing, memory and control divergence, and L1 cache behavior. A GPU consists of several Streaming Multiprocessors (SMs) that collectively determine how shared resources are partitioned and accessed. Recent years have seen divergent paths in SM scaling: scale-up (fewer, larger SMs) vs. scale-out (more, smaller SMs). However, neither scaling up nor scaling out can meet the scalability requirements of all applications running on a given GPU system, which inevitably results in performance degradation and resource under-utilization for some applications. In this work, we investigate major design parameters that influence GPU scaling. We then propose AMOEBA, a solution to GPU scaling through reconfigurable SM cores. AMOEBA monitors and predicts application scalability at run-time and adjusts the SM configuration to meet program requirements. AMOEBA also enables dynamic creation of heterogeneous SMs through independent fusing or splitting. AMOEBA is a microarchitecture-based solution and requires no additional programming effort or custom compiler support. Our experimental evaluations with application programs from various benchmark suites indicate that AMOEBA achieves a maximum performance gain of 4.3x and an average performance improvement of 47% across all benchmarks tested.
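The monitor, predict, and reconfigure loop described in the abstract can be pictured with a small toy model. The sketch below is only an illustration under stated assumptions and is not the paper's mechanism: the `SMPair` type, the counter names, the linear score, and the weights in `PLACEHOLDER_WEIGHTS` are all hypothetical, and the abstract does not say which predictor AMOEBA uses or how each metric pushes the decision. What the sketch does show is how deciding per pair, independently, can leave a GPU with a mix of fused and split SMs.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SMPair:
    """Hypothetical unit: two baseline SMs that run either fused or split."""
    pair_id: int
    fused: bool = False  # False = two small SMs, True = one large SM


# ASSUMPTION: placeholder weights chosen only for illustration.
# Neither the values nor their signs come from the paper.
PLACEHOLDER_WEIGHTS: Dict[str, float] = {
    "l1_miss_rate": 1.0,
    "control_divergence": -1.0,
}


def prefers_scale_up(counters: Dict[str, float],
                     weights: Dict[str, float] = PLACEHOLDER_WEIGHTS,
                     bias: float = 0.0) -> bool:
    """Toy linear predictor: a positive score means 'fuse this pair'."""
    score = bias + sum(w * counters.get(name, 0.0)
                       for name, w in weights.items())
    return score > 0.0


def reconfigure(pairs: List[SMPair],
                samples: Dict[int, Dict[str, float]]) -> List[SMPair]:
    """Decide fuse/split for each pair independently from its own counters."""
    for pair in pairs:
        pair.fused = prefers_scale_up(samples[pair.pair_id])
    return pairs


def logical_sm_count(pairs: List[SMPair]) -> int:
    """A fused pair exposes one large SM; a split pair exposes two small SMs."""
    return sum(1 if pair.fused else 2 for pair in pairs)


if __name__ == "__main__":
    # Made-up counter samples for two pairs, purely for demonstration.
    samples = {
        0: {"l1_miss_rate": 0.9, "control_divergence": 0.1},
        1: {"l1_miss_rate": 0.1, "control_divergence": 0.8},
    }
    pairs = reconfigure([SMPair(0), SMPair(1)], samples)
    for pair in pairs:
        print(pair.pair_id, "fused" if pair.fused else "split")
    print("logical SMs:", logical_sm_count(pairs))
```

With these made-up samples, pair 0 ends up fused and pair 1 stays split, so the toy GPU exposes one large SM alongside two small ones. A real design would make this decision in hardware from run-time counters, as the abstract states, without any programmer or compiler involvement.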