On the Efficiency of Register File versus Broadcast Interconnect for Collective Communications in Data-Parallel Hardware Accelerators

A. Pedram, A. Gerstlauer, R. V. D. Geijn
Published in: 2012 IEEE 24th International Symposium on Computer Architecture and High Performance Computing, 2012-10-24
DOI: 10.1109/SBAC-PAD.2012.35 (https://doi.org/10.1109/SBAC-PAD.2012.35)
Citations: 12

Abstract

Reducing power consumption and increasing efficiency are key concerns for many applications. How to design highly efficient computing elements while maintaining enough flexibility within a domain of applications is a fundamental question. In this paper, we show how broadcast buses can eliminate the need for power-hungry multi-ported register files in data-parallel hardware accelerators for linear algebra operations. We demonstrate an algorithm/architecture co-design for mapping different collective communication operations, which are crucial for achieving performance and efficiency in most linear algebra routines, such as GEMM, SYRK, and matrix transposition. We compare a broadcast-bus-based architecture with conventional SIMD, 2D-SIMD, and flat register file designs for these operations in terms of area and energy efficiency. Results show that fast broadcast data movement in a prototypical linear algebra core can achieve up to 75× better power efficiency and up to 10× better area efficiency than traditional SIMD architectures.
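
As an illustration only, and not taken from the paper, the following Python sketch shows one way broadcast buses can stand in for a shared register file when computing GEMM as a sequence of rank-1 updates. The n x n mesh of processing elements (PEs), the function name gemm_broadcast_mesh, and the bus-transfer count are all assumptions made for this sketch; the abstract does not specify the actual core organization or its cost model.

```python
def gemm_broadcast_mesh(A, B, n):
    """Simulate C = A @ B on a hypothetical n x n mesh of PEs.

    A is n x k and B is k x n (lists of lists). Each PE (i, j) holds
    only its own accumulator C[i][j]; operands reach it over a row bus
    and a column bus instead of a shared multi-ported register file.
    """
    k_dim = len(B)
    C = [[0.0] * n for _ in range(n)]
    bus_transfers = 0  # assumed cost metric: one transfer per bus per step

    for p in range(k_dim):
        # Row bus i carries A[i][p] to every PE in row i.
        row_bus = [A[i][p] for i in range(n)]
        # Column bus j carries B[p][j] to every PE in column j.
        col_bus = [B[p][j] for j in range(n)]
        bus_transfers += 2 * n

        # All PEs perform one multiply-accumulate (a rank-1 update of C).
        for i in range(n):
            for j in range(n):
                C[i][j] += row_bus[i] * col_bus[j]

    return C, bus_transfers


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    C, transfers = gemm_broadcast_mesh(A, B, 2)
    print(C, transfers)
```

In this toy model each step drives 2n bus values to feed n² multiply-accumulates, which is the kind of operand reuse the abstract attributes to broadcast data movement. The energy and area figures reported above come from the authors' evaluation, not from this sketch.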