Liangliang Chang, Zhenyu Liu, Xiangyang Ji, Dongsheng Wang
2017 IEEE International Conference on Image Processing (ICIP), published 2017-09-17. DOI: 10.1109/ICIP.2017.8296833 (https://doi.org/10.1109/ICIP.2017.8296833)
Citations: 0
Coding sensitive based approximation algorithm for power efficient VBS-DCT VLSI design in HEVC hardwired Intra encoder
High Efficiency Video Coding (HEVC), the latest video coding standard, achieves a 50% bit-rate reduction relative to H.264/AVC at comparable visual quality. Rate-Distortion Optimization (RDO) is a computation-intensive module in HEVC encoding. Specifically, during Intra coding, RDO accounts for 62% of the overall encoding time. The 2-dimensional DCT is the most area- and power-consuming component in a VLSI implementation of the RDO module. In this paper, we decompose the DCT matrix multiplication into a series of sparse butterfly structures. In addition, our approximation algorithm drops both the computation and the storage of the 25% of coefficients that lie at high frequencies. The proposed algorithms are integrated into HM15.0. Experiments verify that our methods reduce encoding time by 15.9% at the cost of a 1.03% BDBR increase. We further implement the DCT VLSI design using the TSMC 90nm standard cell library. Under worst-case conditions (125°C, 0.9V), the power dissipation of our DCT is 12.7mW at the maximum clock speed of 311MHz. Compared with the original design, we achieve reductions of 71.9% in hardware cost and 70.2% in power.
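The two ideas in the abstract, butterfly decomposition and dropping high-frequency coefficients, can be illustrated with a minimal Python sketch. It is not the paper's design: it uses only the 4-point HEVC core transform, omits the intermediate scaling shifts of a real encoder, and assumes that the dropped 25% is the quadrant where both frequency indices are in the upper half. The abstract does not say which coefficients are dropped, and the function names here are hypothetical.

```python
# HEVC 4-point core transform matrix (integer approximation of the DCT).
DCT4 = [
    [64, 64, 64, 64],
    [83, 36, -36, -83],
    [64, -64, -64, 64],
    [36, -83, 83, -36],
]


def dct4_matmul(x):
    """Reference: direct matrix-vector product (16 multiplications)."""
    return [sum(c * v for c, v in zip(row, x)) for row in DCT4]


def dct4_butterfly(x):
    """Same result via one butterfly stage (6 multiplications)."""
    # Butterfly stage: sums feed the even outputs, differences the odd ones.
    e0, e1 = x[0] + x[3], x[1] + x[2]
    o0, o1 = x[0] - x[3], x[1] - x[2]
    return [
        64 * (e0 + e1),
        83 * o0 + 36 * o1,
        64 * (e0 - e1),
        36 * o0 - 83 * o1,
    ]


def dct4x4(block, drop_high=False):
    """Separable 2-D transform of a 4x4 block; returns coeffs[v][u].

    Assumption: with drop_high=True the quadrant with u >= 2 and v >= 2
    (4 of 16 coefficients, i.e. 25%) is zeroed. Hardware would skip
    computing and storing these values; zeroing afterwards only mimics
    the resulting output.
    """
    n = 4
    rows = [dct4_butterfly(r) for r in block]  # horizontal pass
    cols = [dct4_butterfly([rows[i][u] for i in range(n)]) for u in range(n)]
    coeffs = [[cols[u][v] for u in range(n)] for v in range(n)]
    if drop_high:
        for v in range(n // 2, n):
            for u in range(n // 2, n):
                coeffs[v][u] = 0
    return coeffs


if __name__ == "__main__":
    x = [10, -3, 7, 2]
    print(dct4_butterfly(x) == dct4_matmul(x))
    block = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
    for row in dct4x4(block, drop_high=True):
        print(row)
```

The same even/odd split applies recursively to the larger HEVC transform sizes, where the even half of an N-point transform is itself an N/2-point transform; that recursion is what makes each stage sparse.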