Unified Forward and Inverse Integer Transforms Design with Fast Algorithm Based Hardware Sharing Architecture

Chia-Wei Chang, Hao-Fan Hsu, Chih-Peng Fan
Published in: 2020 2nd International Conference on Computer Communication and the Internet (ICCCI)
Publication date: 2020-06-01
DOI: 10.1109/ICCCI49374.2020.9145980 (https://doi.org/10.1109/ICCCI49374.2020.9145980)
Citations: 1

Abstract

This study aims to reduce hardware cost by applying hardware-sharing techniques to multiple 4×4 and 8×8 integer discrete cosine transforms for multi-standard video coding. The proposed hardware-sharing architecture supports two transform sizes, 4×4 and 8×8, for the H.264/AVC, AVS, VC-1, MPEG-1/2/4, HEVC, and VP8 video coding standards, and it covers both forward and inverse transforms. First, the transform matrices are rewritten using well-known matrix expressions, and the full transform matrices are then decomposed into several sparse matrices, which reduces computational complexity, chip area, and computation time. Using the proposed matrix decomposition algorithm, the sparse transform matrices are further decomposed into smaller ones for easier and more efficient hardware sharing. Compared with individual implementations without sharing, the proposed 1-D hardware-sharing design for multiple forward and inverse transforms reduces additions by 83.5% and shift operations by 60.8%. The hardware-sharing 1-D forward and inverse transform design has a gate count of 22.2K, and the proposed hardware-sharing 2-D transform requires 57.3K gates. The operating frequency is 110.8 MHz, which satisfies the Full HD (1920×1080@60Hz) specification, and the maximum operating frequency can reach 200 MHz.
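The idea of decomposing a transform matrix into sparse stages can be illustrated with a minimal sketch. It uses the widely documented H.264/AVC 4×4 forward core transform as a stand-in and is not the shared datapath proposed in the paper. The function names (`transform_1d_dense`, `transform_1d_fast`, `transform_2d`) are hypothetical and chosen only for this example.

```python
# Illustrative sketch only: the H.264/AVC 4x4 forward core transform,
# NOT the hardware-sharing architecture described in the abstract.

# Dense form of the H.264/AVC 4x4 forward core transform matrix.
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def transform_1d_dense(x):
    """Reference 1-D transform: plain matrix-vector product."""
    return [sum(c * v for c, v in zip(row, x)) for row in CF]


def transform_1d_fast(x):
    """Same 1-D transform as two sparse butterfly stages (adds and shifts only)."""
    x0, x1, x2, x3 = x
    # Stage 1: sums and differences of mirrored inputs.
    s0, s1 = x0 + x3, x1 + x2
    d0, d1 = x0 - x3, x1 - x2
    # Stage 2: the factor 2 becomes a left shift by one bit.
    return [s0 + s1, (d0 << 1) + d1, s0 - s1, d0 - (d1 << 1)]


def transform_2d(block, transform_1d=transform_1d_fast):
    """Separable 2-D transform: 1-D pass over rows, then over columns."""
    rows = [transform_1d(r) for r in block]
    cols = [transform_1d(list(c)) for c in zip(*rows)]
    # cols[j] holds output column j, so transpose back to row-major order.
    return [list(r) for r in zip(*cols)]
```

In this sketch the fast 1-D form needs 8 additions and 2 shifts, whereas evaluating the dense matrix row by row needs 12 additions and 4 shifts. Savings of this kind are what sparse factorization provides before any sharing across standards is considered.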