CoreVA: A Configurable Resource-Efficient VLIW Processor Architecture

Boris Hübener, Gregor Sievers, T. Jungeblut, Mario Porrmann, U. Rückert
DOI: https://doi.org/10.1109/EUC.2014.11
Published in: 2014 12th IEEE International Conference on Embedded and Ubiquitous Computing
Publication date: 2014-08-26
Citations: 19

Abstract

Mobile signal processing applications have a limited energy budget and require resource-efficient processing elements. General-purpose VLIW CPUs offer high energy efficiency and can execute a wide range of applications in this domain. In this work we present CoreVA, a configurable 32-bit VLIW processor architecture. Besides the number of issue slots, it allows fine-grained configuration of the number and characteristics of the processor's functional units (e.g., ALUs, MACs, or LD/ST units). A design-space exploration is performed to evaluate how these functional units affect area and power consumption. The basic configuration with one ALU, MAC, DIV, and LD/ST unit has a power consumption of 11.796 mW and an area of 0.142 mm² at a clock frequency of 750 MHz in a 28 nm FD-SOI process. The maximum clock frequency in this process node is 833 MHz. To relate the hardware requirements to the possible performance gains of an application, a signal processing algorithm is used as a benchmark to evaluate the energy consumption of different hardware configurations. The lowest energy consumption is observed with a configuration of 4 issue slots using 4 ALUs, 4 MACs, and 2 LD/ST units. This is an improvement by a factor of 1.68 compared to the single-issue-slot configuration.
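The figures quoted in the abstract can be turned into a few derived quantities with simple arithmetic. The following is a minimal sketch, not taken from the paper: the function names are hypothetical, the energy per cycle assumes the stated power is an average at the stated clock frequency, and the last function assumes that "improvement by a factor of 1.68" means the ratio of single-slot energy to 4-slot energy on the benchmark.

```python
def energy_per_cycle_pj(power_mw: float, clock_mhz: float) -> float:
    """Average energy per clock cycle in picojoules, E = P / f."""
    # mW / MHz gives nJ per cycle; multiply by 1000 to get pJ.
    return power_mw / clock_mhz * 1e3


def power_density_mw_per_mm2(power_mw: float, area_mm2: float) -> float:
    """Power per unit area in mW/mm^2."""
    return power_mw / area_mm2


def relative_energy(improvement_factor: float) -> float:
    """Energy of the improved configuration as a fraction of the baseline.

    Assumes improvement_factor = E_baseline / E_improved.
    """
    return 1.0 / improvement_factor


# Values quoted in the abstract for the basic configuration.
POWER_MW = 11.796
AREA_MM2 = 0.142
CLOCK_MHZ = 750.0

print(f"{energy_per_cycle_pj(POWER_MW, CLOCK_MHZ):.3f} pJ/cycle")       # 15.728 pJ/cycle
print(f"{power_density_mw_per_mm2(POWER_MW, AREA_MM2):.1f} mW/mm^2")    # 83.1 mW/mm^2
print(f"{relative_energy(1.68):.3f} of single-slot energy")             # 0.595 of single-slot energy
```

Under these assumptions, the basic configuration spends roughly 15.7 pJ per clock cycle, and the 4-issue-slot configuration needs about 60% of the single-slot energy for the benchmark.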