Ultra-High Definition AV1 FME Interpolation Architectures Exploring Approximate Computing

JCR quartile: Q4 (Engineering)
William Kolodziejski, R. Domanski, M. Porto, B. Zatt, L. Agostini
DOI: 10.29292/jics.v17i2.558 (https://doi.org/10.29292/jics.v17i2.558)
Published: 2022-09-17, Journal of Integrated Circuits and Systems (journal article)
Citations: 0

Abstract

Ultra-High Definition AV1 FME Interpolation Architectures Exploring Approximate Computing
Modern video encoders such as AOMedia Video 1 (AV1) implement several complex tools to achieve the required high level of compression efficiency. Fractional Motion Estimation (FME) is one of these tools, and AV1 FME defines 42 different interpolation filters. To handle this complexity, hardware acceleration using approximate computing has become an attractive alternative. This paper presents three optimized approximate architectures for the AV1 FME interpolation filters. The architectures reach real-time interpolation for UHD 4K videos at 30 frames per second in a low-cost, low-power, and memory-efficient design. The architectures were synthesized for a 40 nm TSMC standard-cell technology, reaching power gains of up to 83% compared to a precise architecture and up to 20% compared to a previously published approximate solution. The area gains were also significant: up to 83% and 40%, respectively. The architectures also allow a memory bandwidth reduction of up to 59.5% in comparison with state-of-the-art solutions. The approximations caused small coding efficiency degradations of 0.54% and 1.25% in BD-BR. The presented architectures achieve the best results found in the literature when considering the trade-off among hardware cost, power dissipation, processing rate, memory bandwidth, and coding efficiency.
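To give a feel for the numbers in the abstract, the following is a minimal sketch and not the authors' design. It computes the sample rate implied by UHD 4K at 30 frames per second (assuming a 3840x2160 luma plane) and compares an 8-tap interpolation kernel against a shift-and-add-friendly approximation. The tap values, the names `EXACT_TAPS` and `APPROX_TAPS`, and the approximation strategy are all illustrative assumptions. The abstract does not say which approximations the three architectures use.

```python
# Illustrative sketch only: names, tap values, and the approximation
# strategy are assumptions, not taken from the paper.

WIDTH, HEIGHT, FPS = 3840, 2160, 30  # assumed UHD 4K luma resolution


def samples_per_second(width=WIDTH, height=HEIGHT, fps=FPS):
    """Luma samples per second for real-time processing at this format."""
    return width * height * fps


# Hypothetical 8-tap half-sample kernel with a gain of 128 (7-bit precision).
EXACT_TAPS = (0, 2, -14, 76, 76, -14, 2, 0)

# Hypothetical approximation: taps built from few powers of two
# (16 = 2^4, 80 = 2^6 + 2^4), so multipliers can become shifts and adds.
APPROX_TAPS = (0, 0, -16, 80, 80, -16, 0, 0)


def interpolate(samples, taps):
    """Apply an 8-tap FIR filter, round, normalize by 128, clip to 8 bits."""
    if len(samples) != len(taps):
        raise ValueError("need exactly one sample per tap")
    acc = sum(s * t for s, t in zip(samples, taps))
    value = (acc + 64) >> 7  # add half of 128 for rounding, then divide by 128
    return max(0, min(255, value))


if __name__ == "__main__":
    print("samples per second:", samples_per_second())
    edge = [20, 20, 20, 20, 200, 200, 200, 200]  # a sharp step edge
    print("exact :", interpolate(edge, EXACT_TAPS))
    print("approx:", interpolate(edge, APPROX_TAPS))
```

Both kernels keep a gain of 128, so flat regions pass through unchanged and only detail near edges can differ. Small differences of this kind are what a coding-efficiency metric such as BD-BR would capture in aggregate.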
Source journal
Journal of Integrated Circuits and Systems
Subject area: Engineering - Electrical and Electronic Engineering
CiteScore: 0.90
Self-citation rate: 0.00%
Articles published: 39
About the journal: This journal presents state-of-the-art papers on Integrated Circuits and Systems. It is a joint effort of the Brazilian Microelectronics Society (SBMicro) and the Brazilian Computer Society (SBC) to create a scientific journal covering Process and Materials, Device and Characterization, Design, Test and CAD of Integrated Circuits and Systems. The Journal of Integrated Circuits and Systems is published through Special Issues on subjects defined by the Editorial Board. Special issues publish selected papers from the annual conferences of both Brazilian societies: SBCCI (Symposium on Integrated Circuits and Systems) and SBMicro (Symposium on Microelectronics Technology and Devices).