Unified Forward and Inverse Integer Transforms Design with Fast Algorithm Based Hardware Sharing Architecture

2020 2nd International Conference on Computer Communication and the Internet (ICCCI). Pub Date: 2020-06-01. DOI: 10.1109/ICCCI49374.2020.9145980

Chia-Wei Chang, Hao-Fan Hsu, Chih-Peng Fan


Citations: 1

Abstract


This paper reduces hardware cost by applying hardware-sharing techniques to multiple 4×4 and 8×8 integer discrete cosine transforms for multi-standard video coding. The proposed architecture supports both transform sizes, 4×4 and 8×8, for the H.264/AVC, AVS, VC-1, MPEG-1/2/4, HEVC, and VP8 video coding standards, and it shares hardware across the forward and inverse transforms. First, the transform matrices are rewritten using well-known matrix expressions, and each full transform matrix is then decomposed into several sparse matrices, which reduces computational complexity, chip area, and computation time. The proposed matrix decomposition algorithm then breaks the sparse matrices down further into smaller ones so that hardware can be shared easily and efficiently. Compared with separate implementations without sharing, the proposed 1-D shared forward and inverse transform design reduces additions by 83.5% and shift operations by 60.8%. The 1-D shared forward and inverse transform design has a gate count of 22.2K, and the shared 2-D transform requires 57.3K gates. An operating frequency of 110.8 MHz satisfies the Full HD (1920×1080 at 60 Hz) specification, and the maximum operating frequency is up to 200 MHz.
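To make the idea of decomposing a transform matrix into sparse factors concrete, here is a minimal Python sketch. It is not the authors' decomposition. It uses only the well-known H.264/AVC 4×4 forward core transform matrix, and the factor names `B1` and `B2` and the helper functions are hypothetical, chosen for illustration. The sketch checks that two sparse stages reproduce the full matrix and that the resulting butterfly needs only additions, subtractions, and shifts.

```python
# H.264/AVC 4x4 forward core transform matrix (integer DCT approximation).
CF = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]

# Illustrative two-stage sparse factorization, CF = B2 * B1.
# B1 forms sums and differences of mirrored inputs; B2 combines them.
B1 = [
    [1, 0,  0,  1],   # s0 = x0 + x3
    [0, 1,  1,  0],   # s1 = x1 + x2
    [1, 0,  0, -1],   # d0 = x0 - x3
    [0, 1, -1,  0],   # d1 = x1 - x2
]
B2 = [
    [1,  1, 0,  0],   # y0 = s0 + s1
    [0,  0, 2,  1],   # y1 = 2*d0 + d1
    [1, -1, 0,  0],   # y2 = s0 - s1
    [0,  0, 1, -2],   # y3 = d0 - 2*d1
]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def matvec(m, x):
    """Multiply a matrix by a column vector."""
    return [sum(c * v for c, v in zip(row, x)) for row in m]


def forward_4pt_butterfly(x):
    """1-D 4-point forward transform using 8 add/subtract operations and 2 shifts."""
    x0, x1, x2, x3 = x
    s0, s1 = x0 + x3, x1 + x2
    d0, d1 = x0 - x3, x1 - x2
    return [s0 + s1, (d0 << 1) + d1, s0 - s1, d0 - (d1 << 1)]


if __name__ == "__main__":
    sample = [1, 2, 3, 4]
    print(matmul(B2, B1) == CF)
    print(forward_4pt_butterfly(sample), matvec(CF, sample))
```

Because every nonzero entry of the factors is ±1 or ±2, each stage maps to adders and fixed wiring shifts rather than multipliers, which is the kind of structure that sharing across standards can exploit.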
