Hardware design for the 32×32 IDCT of the HEVC video coding standard
R. Conceição, J. Cláudio, Souza Jr, R. Jeske, M. Porto, J. Mattos, L. Agostini
2013 26th Symposium on Integrated Circuits and Systems Design (SBCCI), published 2013-10-24
DOI: https://doi.org/10.1109/SBCCI.2013.6644881
Citations: 7
Abstract
This paper focuses on the inverse transforms defined in the High Efficiency Video Coding (HEVC) standard. The transform stage is one of the innovations introduced by HEVC: compared with previous standards, it supports the largest number of transform sizes (four) and the largest transform size (up to 32×32). The inverse DCT is performed by both the video encoder and the decoder. This paper presents an efficient hardware design for the 32×32 HEVC IDCT based on the separability principle. The design targets real-time processing (at least 30 frames per second) of high-resolution video by exploiting a high level of parallelism (32 samples consumed per clock cycle). To achieve low latency and low cost, the architecture is purely combinational and uses a multiplierless approach. The design was synthesized for an Altera Stratix IV FPGA. The synthesis results show that the architecture can process more than 30 QFHD frames (3840×2160 pixels) per second, with a latency of 33 clock cycles.
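The two ideas the abstract relies on, separability and multiplierless arithmetic, can be illustrated with a small software sketch. It is not the authors' architecture. For brevity it uses the 4-point HEVC core transform matrix instead of the 32-point one, and it omits the intermediate scaling and clipping stages. All function names (`mul_shift_add`, `idct_1d`, `idct_2d_separable`, `idct_2d_direct`) are hypothetical and chosen only for this example.

```python
# Illustrative sketch only: 4-point HEVC core matrix, no scaling or clipping.
C4 = [
    [64,  64,  64,  64],
    [83,  36, -36, -83],
    [64, -64, -64,  64],
    [36, -83,  83, -36],
]


def mul_shift_add(x, c):
    """Multiply x by a constant from C4 using only shifts and additions."""
    mag = abs(c)
    if mag == 64:
        p = x << 6                                 # 64x
    elif mag == 83:
        p = (x << 6) + (x << 4) + (x << 1) + x     # 64x + 16x + 2x + x
    elif mag == 36:
        p = (x << 5) + (x << 2)                    # 32x + 4x
    else:
        raise ValueError(f"unsupported constant {c}")
    return -p if c < 0 else p


def idct_1d(coeffs):
    """1-D inverse transform: out[i] = sum over k of C4[k][i] * coeffs[k]."""
    n = len(C4)
    return [
        sum(mul_shift_add(coeffs[k], C4[k][i]) for k in range(n))
        for i in range(n)
    ]


def transpose(m):
    return [list(row) for row in zip(*m)]


def idct_2d_separable(block):
    """2-D inverse transform as two 1-D passes: columns first, then rows."""
    stage1 = transpose([idct_1d(col) for col in transpose(block)])
    return [idct_1d(row) for row in stage1]


def idct_2d_direct(block):
    """Reference 2-D inverse transform computed as a full double sum."""
    n = len(C4)
    return [
        [
            sum(C4[k][i] * C4[l][j] * block[k][l]
                for k in range(n) for l in range(n))
            for j in range(n)
        ]
        for i in range(n)
    ]
```

Both functions return the same result for any integer block, but the separable form needs two passes of N-point 1-D transforms rather than one N²-term sum per output sample, which is what makes a 32-samples-per-cycle datapath plausible. A block with only the DC coefficient set to 1 yields 64 × 64 = 4096 at every output position in this unscaled sketch.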