A Fully Parameterizable Low Power Design of Vector Fused Multiply-Add Using Active Clock-Gating Techniques

Ivan Ratković, Oscar Palomar, Milan Stanic, O. Unsal, A. Cristal, M. Valero
Proceedings of the 2016 International Symposium on Low Power Electronics and Design. Published 2016-08-08. DOI: 10.1145/2934583.2934587 (https://doi.org/10.1145/2934583.2934587)
Citation count: 2

Abstract

The need for power efficiency is driving a rethink of design decisions in processor architectures. While vector processors succeeded in the high-performance market in the past, they need re-tailoring for the mobile market that they are now entering. The floating-point fused multiply-add unit, being a power-hungry functional unit, deserves special attention. Although clock-gating is a well-known method to reduce switching power in synchronous designs, there are unexplored opportunities for its application to vector processors, especially when considering the active operating mode. In this research, we comprehensively identify, propose, and evaluate the most suitable clock-gating techniques for vector fused multiply-add units (VFU). These techniques ensure power savings without jeopardizing timing. Using vector masking and vector multi-lane-aware clock-gating, we report power reductions of up to 52%, assuming an active VFU operating at peak performance. Among other findings, we observe that vector instruction-based clock-gating techniques achieve power savings for all vector floating-point instructions. We perform this research in a fully parameterizable and automated fashion using various tools at both the architectural and circuit levels.
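
To make the idea of mask-aware and multi-lane-aware clock-gating more concrete, here is a minimal toy sketch in Python. It is not the authors' design or power model. It only counts how many lane-cycles would receive a clock if a lane were gated whenever it has no active element to process. The function name `clocked_lane_slots`, the example mask, and the lane count are all hypothetical, and the sketch assumes elements are issued `lanes` at a time.

```python
from typing import List


def clocked_lane_slots(mask: List[bool], lanes: int, gate: bool) -> int:
    """Count lane-cycles in which a lane's registers receive a clock.

    Toy model (assumption): a vector of len(mask) elements is issued
    `lanes` elements per cycle. Without gating, every lane is clocked in
    every cycle. With gating, a lane is clocked only when it holds an
    element whose mask bit is set, so both masked-off elements and the
    idle lanes of a partially filled final cycle are skipped.
    """
    if lanes <= 0:
        raise ValueError("lanes must be positive")
    cycles = -(-len(mask) // lanes)  # ceiling division
    if not gate:
        return cycles * lanes
    return sum(1 for active in mask if active)


# Hypothetical example: 10 elements, 4 lanes, 6 elements enabled by the mask.
mask = [True, False, True, True, False, False, True, False, True, True]
ungated = clocked_lane_slots(mask, lanes=4, gate=False)
gated = clocked_lane_slots(mask, lanes=4, gate=True)
saved_fraction = 1 - gated / ungated
```

In this example the ungated unit clocks 12 lane-cycles and the gated one clocks 6. That ratio is an activity count only, not a power figure: real savings depend on the circuit, the gating overhead, and the workload, none of which this sketch models.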