Performance Analysis of OpenCL and CUDA Programming Models for the High Efficiency Video Coding

Randa Khemiri, Soulef Bouaafia, Asma Bahba, Maha Nasr, Fatma Ezahra Sayadi
Published in: Digital Image Processing - Advances and Applications [Working Title], 2021-10-19. DOI: 10.5772/intechopen.99823 (https://doi.org/10.5772/intechopen.99823)

Abstract

In motion estimation (ME), block matching algorithms have great potential for parallelism. The best match is found by computing the similarity at each block position inside the search area using a similarity metric such as the Sum of Absolute Differences (SAD), which is used in the various steps of motion estimation algorithms. Moreover, SAD can be parallelized on a Graphics Processing Unit (GPU), since the same computation is applied to the pixels of every block, thus offering better results. In this work, a fixed OpenCL code was first run on several architectures, such as CPU and GPU; second, a parallel GPU implementation of the SAD process was proposed in CUDA and OpenCL, using block sizes from 4x4 to 64x64. A comparative study of GPU execution times was carried out on the same video sequence. The experimental results indicated that the OpenCL execution time on the GPU was better than that of CUDA, with a performance ratio that reached a factor of two.
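
To make the block-matching step concrete, the following is a minimal sequential sketch of SAD-based full search in plain Python. It is not the authors' CUDA or OpenCL code: the function names (`sad`, `full_search`), the frame layout (lists of rows), and the toy 8x8 frames are all assumptions made for illustration. The point it shows is that the SAD of each candidate position depends only on its own pixels, so the iterations of the two search loops are independent of one another, which is the property a GPU implementation can exploit.

```python
def sad(cur, ref, bx, by, rx, ry, n):
    """Sum of Absolute Differences between the n x n block of `cur` at (bx, by)
    and the n x n block of `ref` at (rx, ry). Frames are lists of pixel rows."""
    total = 0
    for j in range(n):
        for i in range(n):
            total += abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
    return total


def full_search(cur, ref, bx, by, n, search_range):
    """Exhaustive block matching (hypothetical helper).

    Returns (best_sad, (dx, dy)) for the block of `cur` at (bx, by),
    or None if no candidate lies inside the reference frame."""
    height, width = len(ref), len(ref[0])
    best = None
    # Every (dx, dy) candidate is independent of the others, so on a GPU
    # each one could be evaluated by a separate thread or work-item.
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if rx < 0 or ry < 0 or rx + n > width or ry + n > height:
                continue
            cost = sad(cur, ref, bx, by, rx, ry, n)
            if best is None or cost < best[0]:
                best = (cost, (dx, dy))
    return best


# Toy data (assumed): an 8x8 reference frame with distinct pixel values,
# and a current frame whose content is the reference moved right by one pixel.
ref = [[7 * x + 13 * y for x in range(8)] for y in range(8)]
cur = [[ref[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]

result = full_search(cur, ref, bx=2, by=2, n=4, search_range=2)
print(result)  # (0, (-1, 0)): a perfect match one pixel to the left
```

The same structure applies to any block size in the 4x4 to 64x64 range mentioned above by changing `n`; only the amount of work per candidate changes.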