Ivan Ratković, Oscar Palomar, Milan Stanic, O. Unsal, A. Cristal, M. Valero
{"title":"A Fully Parameterizable Low Power Design of Vector Fused Multiply-Add Using Active Clock-Gating Techniques","authors":"Ivan Ratković, Oscar Palomar, Milan Stanic, O. Unsal, A. Cristal, M. Valero","doi":"10.1145/2934583.2934587","DOIUrl":null,"url":null,"abstract":"The need for power-efficiency is driving a rethink of design decisions in processor architectures. While vector processors succeeded in the high-performance market in the past, they need a re-tailoring for the mobile market that they are entering now. Floating point fused multiply-add, being a power consuming functional unit, deserves special attention. Although clock-gating is a well-known method to reduce switching power in synchronous designs, there are unexplored opportunities for its application to vector processors, especially when considering active operating mode. In this research, we comprehensively identify, propose, and evaluate the most suitable clock-gating techniques for vector fused multiply-add units (VFU). These techniques ensure power savings without jeopardizing the timing. Using vector masking and vector multi-lane-aware clock-gating, we report power reductions of up to 52%, assuming active VFU operating at the peak performance. Among other findings, we observe that vector instruction-based clock-gating techniques achieve power savings for all vector floating-point instructions. We perform this research in a fully parameterizable and automated fashion using various tools at both architectural and circuit levels.","PeriodicalId":142716,"journal":{"name":"Proceedings of the 2016 International Symposium on Low Power Electronics and Design","volume":"396 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2016-08-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2016 International Symposium on Low Power Electronics and Design","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2934583.2934587","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 2
Abstract
The need for power-efficiency is driving a rethink of design decisions in processor architectures. While vector processors succeeded in the high-performance market in the past, they need a re-tailoring for the mobile market that they are entering now. Floating point fused multiply-add, being a power consuming functional unit, deserves special attention. Although clock-gating is a well-known method to reduce switching power in synchronous designs, there are unexplored opportunities for its application to vector processors, especially when considering active operating mode. In this research, we comprehensively identify, propose, and evaluate the most suitable clock-gating techniques for vector fused multiply-add units (VFU). These techniques ensure power savings without jeopardizing the timing. Using vector masking and vector multi-lane-aware clock-gating, we report power reductions of up to 52%, assuming active VFU operating at the peak performance. Among other findings, we observe that vector instruction-based clock-gating techniques achieve power savings for all vector floating-point instructions. We perform this research in a fully parameterizable and automated fashion using various tools at both architectural and circuit levels.