Fine-Granular Parallel EBCOT and Optimization with CUDA for Digital Cinema Image Compression
Authors: Fang Wei, Qiu Cui, Ye Li
Published in: 2012 IEEE International Conference on Multimedia and Expo, 2012-07-09
DOI: 10.1109/ICME.2012.115 (https://doi.org/10.1109/ICME.2012.115)
JPEG2000 has been accepted by the Society of Motion Picture and Television Engineers (SMPTE) as the image compression standard for the digital distribution of motion pictures. In JPEG2000, the largest contribution to coding performance comes from Embedded Block Coding with Optimized Truncation (EBCOT), which is also the most time-consuming module, accounting for almost 37% of the encoding time. Much research has addressed the optimization of EBCOT on platforms such as FPGA and VLSI, but little work has been done on the Graphics Processing Unit (GPU), a parallel computing platform now popular in motion picture post-production. This paper proposes a fine-granular parallel EBCOT that redesigns the highly serialized bit-plane coding into a parallel structure in which all bits in a bit-plane can be coded in parallel. The bit coding tasks can then be distributed to the stream processors of the GPU by taking advantage of the CUDA programming and memory model. Experimental results show that our algorithms achieve a 3 to 4 times computational speedup on an ordinary GPU compared with a CPU.
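
To make the idea of coding all bits of a bit-plane in parallel more concrete, the following is a minimal Python sketch and not the authors' algorithm. It assumes a simplified model in which the coding pass for each coefficient is decided only from significance state left by higher bit-planes, so every position can be evaluated independently (one position per GPU thread in a CUDA port). Standard EBCOT also updates significance within a plane in scan order, which this sketch deliberately ignores. The function names and pass labels are hypothetical.

```python
def bit_plane(mags, p):
    """Return bit p of every coefficient magnitude in a 2-D code block."""
    return [[(m >> p) & 1 for m in row] for row in mags]


def significant_before(mags, p):
    """True where a coefficient already has a 1 bit in some plane above p."""
    return [[(m >> (p + 1)) != 0 for m in row] for row in mags]


def classify(mags, p):
    """Assign each coefficient of plane p to a coding pass.

    Assumption: only significance from higher planes is consulted, so each
    position reads shared, read-only state and writes only its own result.
    """
    sig = significant_before(mags, p)
    h, w = len(mags), len(mags[0])

    def has_sig_neighbor(y, x):
        # Check the 8-connected neighborhood, clipped at the block border.
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                ny, nx = y + dy, x + dx
                if (dy or dx) and 0 <= ny < h and 0 <= nx < w and sig[ny][nx]:
                    return True
        return False

    passes = []
    for y in range(h):          # in a GPU port, (y, x) would be a thread index
        row = []
        for x in range(w):
            if sig[y][x]:
                row.append("refine")      # already significant
            elif has_sig_neighbor(y, x):
                row.append("propagate")   # likely to become significant
            else:
                row.append("cleanup")     # everything else
        passes.append(row)
    return passes
```

Because no iteration of the two loops in `classify` reads a value written by another iteration, the loop body could be mapped directly onto independent threads. That independence is the property a fine-granular parallel design needs, and under this sketch's assumptions it holds only because in-plane significance updates are left out.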