Approximate Constant-Coefficient Multiplication Using Hybrid Binary-Unary Computing for FPGAs

S. R. Faraji, Pierre Abillama, K. Bazargan
Journal: ACM Transactions on Reconfigurable Technology and Systems (TRETS)
Publication date: 2021-12-28
Publication type: Journal Article
DOI: 10.1145/3494570 (https://doi.org/10.1145/3494570)
Citations: 2

Abstract

Multipliers are used in virtually all digital signal processing (DSP) applications, such as image and video processing. Multiplier efficiency directly affects the overall performance of such applications, especially when real-time processing is needed, as in 4K video processing, or where hardware resources are limited, as in mobile and IoT devices. We propose a novel, low-cost, low-energy, high-speed approximate constant-coefficient multiplier (CCM) that uses a hybrid binary-unary encoding method. The proposed method implements a CCM using simple routing networks with no logic gates in the unary domain, which on average yields more efficient multipliers than Xilinx LogiCORE IP CCMs and table-based KCM CCMs (FloPoCo). We evaluate the proposed multipliers on a 2-D discrete cosine transform (DCT), a common DSP module. Post-routing FPGA results show that the proposed multipliers can improve the area, area × delay, power consumption, and energy-delay product of a 2-D DCT by 30%, 33%, 30%, and 31% on average, respectively. Moreover, the throughput of the proposed 2-D DCT is on average 5% higher than that of the binary architecture implemented using table-based KCM CCMs. We also show that our method has fewer routability issues than binary implementations when implementing a DCT core.
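To make the "routing networks with no logic gates" idea concrete, the following is a minimal software sketch of constant multiplication on a thermometer (unary) code. It is an illustration under stated assumptions, not the authors' architecture: it models only the unary half and omits the hybrid step of splitting the input into binary and unary parts. The function names, the floor rounding, and the rational coefficient num/den are all hypothetical choices made for the example. It relies on the fact that a monotonically non-decreasing function of a thermometer-coded input can be computed by tying each output wire to a single input wire.

```python
# Illustrative model only: names, rounding mode and coefficient format are
# assumptions for this sketch, not taken from the paper.

def thermometer_encode(x, width):
    """Return `width` wires; wire i is 1 exactly when x > i."""
    return [1 if x > i else 0 for i in range(width)]


def thermometer_decode(bits):
    """A thermometer code represents the number of wires that are 1."""
    return sum(bits)


def build_routing(num, den, in_width):
    """Wiring table for y = floor(num * x / den), with 0 <= x <= in_width.

    Entry j is the index of the input wire that output wire j is tied to.
    Output wire j must be 1 exactly when y >= j + 1, which (because y is
    monotone in x) happens exactly when x reaches a fixed threshold.
    """
    out_width = (num * in_width) // den
    routing = []
    for j in range(out_width):
        # Smallest input value whose product reaches j + 1.
        threshold = next(
            x for x in range(1, in_width + 1) if (num * x) // den >= j + 1
        )
        routing.append(threshold - 1)  # input wire i is 1 when x >= i + 1
    return routing


def unary_ccm(x, routing, in_width):
    """Multiply by the constant using only the precomputed wiring."""
    t = thermometer_encode(x, in_width)
    u = [t[i] for i in routing]  # pure routing: no gates, only (fanned-out) wires
    return thermometer_decode(u)


if __name__ == "__main__":
    in_width = 15                      # 4-bit input range, 0..15
    routing = build_routing(3, 4, in_width)
    for x in range(in_width + 1):
        # The routed result matches floor(3 * x / 4) for every input.
        assert unary_ccm(x, routing, in_width) == (3 * x) // 4
```

In this model the multiplier itself costs nothing but wires; the encoder and decoder carry the logic cost, which is why a full thermometer code scales poorly with input width.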