Unified Forward and Inverse Integer Transforms Design with Fast Algorithm Based Hardware Sharing Architecture
Chia-Wei Chang, Hao-Fan Hsu, Chih-Peng Fan
2020 2nd International Conference on Computer Communication and the Internet (ICCCI), 2020-06-01
DOI: https://doi.org/10.1109/ICCCI49374.2020.9145980
This paper aims to reduce hardware cost by applying hardware sharing techniques to multiple 4×4 and 8×8 integer discrete cosine transforms for multi-standard video coding applications. The proposed hardware sharing architecture supports two transform sizes, 4×4 and 8×8, for the H.264/AVC, AVS, VC-1, MPEG-1/2/4, HEVC, and VP8 video coding standards, and it shares hardware across the multiple forward and inverse transforms. First, the transform matrices are rewritten using well-known matrix expressions, and the full transform matrices are then decomposed into several sparse matrices, which reduces the computational complexity, the chip area, and the computation time. Using the proposed matrix decomposition algorithm, the sparse transform matrices are further decomposed into smaller ones for easy and efficient hardware sharing. Compared with individual implementations without sharing, the proposed 1-D hardware-sharing forward and inverse transform design reduces additions by 83.5% and shift operations by 60.8%. The gate count of the hardware-sharing 1-D forward and inverse transform design is 22.2K, and the proposed hardware-sharing 2-D transform requires 57.3K gates. An operating frequency of 110.8 MHz satisfies the Full HD (1920×1080@60Hz) specification, and the maximum operating frequency can reach up to 200 MHz.
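The idea of replacing a dense transform matrix with a product of sparse stages can be illustrated with a minimal sketch. It assumes the widely published 4×4 forward core transform matrix of H.264/AVC as the example. The abstract does not give the paper's own decomposition or its shared datapath, so the function names (`forward4_matrix`, `forward4_butterfly`, `forward4x4_2d`) and the stage split below are hypothetical and only show the general technique: a butterfly stage of additions and subtractions followed by a stage in which multiplications by 2 become shifts. The 2-D transform is then built from the 1-D one by the usual row-column method.

```python
# Forward 4x4 core transform matrix of H.264/AVC. It is used here only as an
# illustration (assumption): the paper's own matrices and decomposition are
# not given in the abstract.
CF4 = [
    [1,  1,  1,  1],
    [2,  1, -1, -2],
    [1, -1, -1,  1],
    [1, -2,  2, -1],
]


def forward4_matrix(x):
    """Direct 1-D transform: a full matrix-vector product."""
    return [sum(c * v for c, v in zip(row, x)) for row in CF4]


def forward4_butterfly(x):
    """Same 1-D transform from two sparse stages: 8 additions, 2 shifts."""
    x0, x1, x2, x3 = x
    # Stage 1: butterfly (a sparse matrix with entries +1 / -1 only).
    s0, s1 = x0 + x3, x1 + x2
    d0, d1 = x0 - x3, x1 - x2
    # Stage 2: the factors of 2 are implemented as left shifts.
    return [s0 + s1, (d0 << 1) + d1, s0 - s1, d0 - (d1 << 1)]


def forward4x4_2d(block, transform_1d=forward4_butterfly):
    """2-D transform Y = C X C^T by the row-column method."""
    # Transform every column of X; cols[j] holds column j of (C X).
    cols = [transform_1d(list(col)) for col in zip(*block)]
    # Transposing cols yields the rows of (C X); transform each row.
    return [transform_1d(list(row)) for row in zip(*cols)]


if __name__ == "__main__":
    sample = [1, 2, 3, 4]
    print(forward4_matrix(sample))
    print(forward4_butterfly(sample))
    print(forward4x4_2d([[1, 1, 1, 1]] * 4))
```

Both 1-D functions return the same coefficients, but the butterfly form needs no multipliers, and its first stage has the same add/subtract structure that many integer transforms share, which is the kind of common substructure a hardware-sharing design can reuse across standards.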