Fine-Granular Parallel EBCOT and Optimization with CUDA for Digital Cinema Image Compression

2012 IEEE International Conference on Multimedia and Expo. Pub Date: 2012-07-09. DOI: 10.1109/ICME.2012.115

Fang Wei, Qiu Cui, Ye Li

Citations: 11

Abstract

JPEG2000 has been accepted by the Society of Motion Picture and Television Engineers (SMPTE) as the image compression standard for the digital distribution of motion pictures. In JPEG2000, the largest contribution to coding performance comes from Embedded Block Coding with Optimized Truncation (EBCOT), which is also the most time-consuming module, accounting for almost 37% of the encoding time. Much research has addressed optimizing EBCOT on platforms such as FPGA and VLSI, but little work has been done on the Graphics Processing Unit (GPU), a parallel computing platform currently popular in motion picture post-production. This paper proposes a fine-granular parallel EBCOT that redesigns the highly serialized bit-plane coding into a parallel structure in which all bits in a bit-plane can be coded in parallel. The bit coding tasks can then be distributed to the stream processors of the GPU by taking advantage of the CUDA programming and memory model. Experimental results show that our algorithms achieve a 3 to 4 times improvement in computational speed on an ordinary GPU compared with a CPU.
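The abstract does not spell out how the bit-plane coder is restructured, so the following is only a minimal Python sketch of the general data-parallel pattern it alludes to, not the authors' algorithm. It assumes sign-magnitude integer coefficients in a small code block and shows two per-sample quantities that depend on nothing but the sample itself, and could therefore be computed by one GPU thread per sample: the magnitude bit in plane `p`, and whether the sample was already significant before plane `p` (such samples are coded in the magnitude refinement pass). The function names `bit_plane` and `significant_before` are hypothetical, and the context formation that makes real EBCOT serial is deliberately left out.

```python
def bit_plane(coeffs, p):
    """Return bit p of the magnitude of every coefficient.

    Each output entry depends only on its own input sample, so on a GPU
    every entry could be produced by a separate thread.
    """
    return [[(abs(c) >> p) & 1 for c in row] for row in coeffs]


def significant_before(coeffs, p):
    """Return 1 where a sample has a nonzero magnitude bit above plane p.

    Such samples became significant in an earlier (higher) bit-plane and
    are therefore handled by the magnitude refinement pass in plane p.
    """
    return [[1 if (abs(c) >> (p + 1)) != 0 else 0 for c in row]
            for row in coeffs]


if __name__ == "__main__":
    # Hypothetical 2x2 code block of quantized wavelet coefficients.
    block = [[5, -3],
             [0, 8]]
    for plane in range(3, -1, -1):  # most significant plane first
        print(plane, bit_plane(block, plane), significant_before(block, plane))
```

Both functions are pure element-wise maps, which is why they parallelize trivially. The difficult part of EBCOT, and presumably the focus of the paper, is the dependence of each bit's coding context on neighbours updated earlier in the same pass.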
