Optimizing the Census Transform on CUDA enabled GPUs

C. Pantilie, S. Nedevschi
Published in: 2012 IEEE 8th International Conference on Intelligent Computer Communication and Processing
Publication date: 2012-11-26
DOI: 10.1109/ICCP.2012.6356186 (https://doi.org/10.1109/ICCP.2012.6356186)
Citations: 7

Abstract

The Census Transform is one of the most widely used matching metrics in problems that involve correspondence search, such as stereo reconstruction and optical flow. Graphics processing units (GPUs) have become popular platforms for such computation-intensive applications, which expose a high degree of data parallelism. Their evolution into a platform for general-purpose computing, through the continuous addition of new hardware features, has improved performance for many applications, but it has also expanded the set of possible implementation choices to the point where guidelines alone are not sufficient for optimum performance. What is the best implementation in the case of the Census Transform? This paper answers that question by benchmarking all major possible implementations. Its aim is to provide an optimal implementation of the Census Transform on a current-generation graphics processing unit using the Compute Unified Device Architecture (CUDA). The results have value reaching far beyond the Census Transform and provide insight for applications where non-separable 2D convolutions are present.
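The abstract does not define the transform itself, so the following is a minimal CPU reference sketch of its commonly used form, not the paper's GPU implementation. Each pixel is replaced by a bit string with one bit per neighbor in a window, set when the neighbor is darker than the center; two signatures are then compared by Hamming distance. The 3x3 window, the border clamping, and the names `census3x3`, `censusTransform`, and `hammingDistance` are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Census signature of pixel (x, y) over a 3x3 window (window size is an
// assumption of this sketch). Neighbors are visited in raster order; each
// contributes one bit, set when the neighbor is darker than the center.
// Neighbors outside the image are clamped to the border (also an assumption).
std::uint8_t census3x3(const std::vector<std::uint8_t>& img,
                       int width, int height, int x, int y) {
    const std::uint8_t center = img[y * width + x];
    std::uint8_t signature = 0;
    for (int dy = -1; dy <= 1; ++dy) {
        for (int dx = -1; dx <= 1; ++dx) {
            if (dx == 0 && dy == 0) continue;  // skip the center pixel
            const int nx = std::min(std::max(x + dx, 0), width - 1);
            const int ny = std::min(std::max(y + dy, 0), height - 1);
            const int bit = img[ny * width + nx] < center ? 1 : 0;
            signature = static_cast<std::uint8_t>((signature << 1) | bit);
        }
    }
    return signature;
}

// Applies census3x3 to every pixel of a row-major 8-bit image.
std::vector<std::uint8_t> censusTransform(const std::vector<std::uint8_t>& img,
                                          int width, int height) {
    std::vector<std::uint8_t> out(img.size());
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            out[y * width + x] = census3x3(img, width, height, x, y);
    return out;
}

// Matching cost between two signatures: the number of differing bits.
int hammingDistance(std::uint8_t a, std::uint8_t b) {
    int count = 0;
    for (unsigned v = static_cast<unsigned>(a ^ b); v != 0; v >>= 1)
        count += static_cast<int>(v & 1u);
    return count;
}
```

Every output pixel reads a full 2D neighborhood that overlaps with those of adjacent pixels, and the comparison against the center does not factor into a row pass followed by a column pass. That access pattern is what the transform shares with the non-separable 2D convolutions mentioned in the abstract.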