M. Mohamed, Bahri Nejmeddine, Batel Noureddine, Toubal Abdelmoughni, Masmoudi Nouri
2018 International Conference on Applied Smart Systems (ICASS), published 2018-11-01
DOI: 10.1109/ICASS.2018.8652076 (https://doi.org/10.1109/ICASS.2018.8652076)
Citations: 1
Performance Evaluation of Frame-level Parallelization in HEVC Intra Coding Using Heterogeneous Multicore Platforms
High Efficiency Video Coding (HEVC) is the latest video coding standard, released as the successor to H.264/AVC, and is expected to reduce the bitrate by 50% at the same perceptual quality. One of the major contributors to the higher compression performance of HEVC is the introduction of larger Coding Units (CUs) with a recursive partitioning mechanism. This gain comes with high computational complexity, which makes the new standard difficult to embed in current multimedia services and broadcast platforms. This paper presents a performance evaluation of a parallel All-Intra (AI) realization of the HEVC encoder on a heterogeneous octa-core CubieBoard4 platform that combines two quad-core clusters, ARM A7 and ARM A15. We used the OpenMP paradigm for the parallel realization, where each thread is assigned to one processor core and encodes a separate frame. The AI configuration is used to break coding dependencies between successive frames, which allows a set of frames to be processed in parallel. Experimental results show that the proposed parallel realization of the HEVC encoder, using eight threads, speeds up encoding by a factor of about 4.35 without any loss in coding performance. These results fall short of the expected acceleration because of the heterogeneity of the platform.