Application Specific Approximate Behavioral Processor

IF 3 | Zone 3: Computer Science | Q2: COMPUTER SCIENCE, HARDWARE & ARCHITECTURE
Qilin Si;Prattay Chowdhury;Rohit Sreekumar;Benjamin Carrion Schafer
IEEE Transactions on Sustainable Computing, vol. 8, no. 2, pp. 165-179. Published 2022-11-14. DOI: 10.1109/TSUSC.2022.3222117. URL: https://ieeexplore.ieee.org/document/9950345/
Citations: 0

Abstract

Many applications require simple controllers that continuously run the same program. These applications are often found in battery-operated embedded systems that must be ultra-low power (ULP) and are very price sensitive. Examples include various IoT devices and medical devices. Currently, these systems rely on off-the-shelf general-purpose microprocessors. One problem with these processors is that a specific application does not need all of their resources. Furthermore, because the workloads running on these systems are so regular, there is a large opportunity to optimize the processor by pruning the unused resources to achieve lower area (cost) and power. Moreover, these processors can be specified at the behavioral level, with High-Level Synthesis (HLS) used to generate an efficient Register Transfer Level (RTL) description. This opens a window to additional optimizations, as the processor implementation is fully re-optimized during the HLS process. In addition, many applications running on these embedded systems tolerate imprecise outputs, including image processing and digital signal processing (DSP) applications. This opens the door to further optimizations in the context of approximate computing. To address these issues, this work presents a methodology to automatically customize a behavioral RISC processor for a given workload such that its area and power are significantly reduced compared to the original general-purpose processor. The method first generates a bespoke processor that produces exactly the same output as the original general-purpose one, and then approximates it, allowing a certain level of error at the output. Compared to previous work that customizes a given processor at the gate netlist level only, the proposed method shows significant benefits. In particular, this work shows that raising the level of abstraction reduces area and power by 78.3% and 70.1% on average for the exact solution, and further reduces the area by an additional 10.0% and 16.5% for the approximate version when tolerating a maximum output error of 10% and 20%, respectively.
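
The pruning idea in the abstract can be illustrated with a small sketch. This is not the authors' tool or HLS flow. It assumes a made-up instruction set, a made-up mapping from opcodes to functional units (`OPCODE_UNIT`), and a workload given as a plain list of executed opcodes. Under those assumptions it reports which opcodes and units the workload never exercises, and which would therefore be candidates for removal in a bespoke processor.

```python
from collections import Counter

# Hypothetical ISA for illustration only: opcode -> functional unit it needs.
OPCODE_UNIT = {
    "add": "alu", "sub": "alu", "and": "alu",
    "mul": "multiplier", "div": "divider",
    "sll": "shifter",
    "lw": "load_store", "sw": "load_store",
    "beq": "branch",
}


def prunable_resources(trace):
    """Return (unused_opcodes, unused_units) for a trace of executed opcodes."""
    used = Counter(trace)
    unknown = set(used) - set(OPCODE_UNIT)
    if unknown:
        raise ValueError(f"opcodes not in the assumed ISA: {sorted(unknown)}")
    # Opcodes the workload never executes are candidates for pruning.
    unused_ops = sorted(op for op in OPCODE_UNIT if op not in used)
    # A unit can only be dropped if no executed opcode depends on it.
    used_units = {OPCODE_UNIT[op] for op in used}
    unused_units = sorted(set(OPCODE_UNIT.values()) - used_units)
    return unused_ops, unused_units


# Assumed example workload: a fixed trace of executed opcodes.
trace = ["lw", "add", "add", "sll", "sw", "beq"]
unused_ops, unused_units = prunable_resources(trace)
print("prunable opcodes:", unused_ops)
print("prunable units:", unused_units)
```

In this toy trace the multiplier and divider are never used, so both could be dropped, while the ALU must stay because `add` is executed even though `sub` and `and` are not. A real flow would have to derive this information from the actual workload and processor description rather than from a hand-written table.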
Source journal: IEEE Transactions on Sustainable Computing (Mathematics: Control and Optimization)
CiteScore: 7.70
Self-citation rate: 2.60%
Articles published: 54