多gpu高斯滤波实时大数据处理

Chaolong Zhang, Yuanping Xu, Jia He, Jun Lu, Li Lu, Zhijie Xu
{"title":"多gpu高斯滤波实时大数据处理","authors":"Chaolong Zhang, Yuanping Xu, Jia He, Jun Lu, Li Lu, Zhijie Xu","doi":"10.1109/SKIMA.2016.7916225","DOIUrl":null,"url":null,"abstract":"Gaussian filtering has been extensively used in the field of surface metrology. However, the computing performance becomes a core bottleneck for Gaussian filtering algorithm based applications when facing large-scale and/or real-time data processing. Although researchers tried to accelerate Gaussian filtering algorithm by using GPU (Graphics Processing Unit), single GPU still fail to meet the large-scale and real-time requirements of surface texture micro- and nano-measurements. Therefore, to solve this bottleneck problem, this paper proposes a single node multi-GPUs based computing framework to accelerate the 2D Gaussian filtering algorithm. This paper presents that the devised framework seamlessly integrated the multi-level spatial domain decomposition method and the CUDA stream mechanism to overlap the two main time consuming steps, i.e., the data transfer and GPU kernel execution, such that it can increase concurrency and reduce the overall running time. This paper also tests and evaluates the proposed computing framework with other three conventional solutions by using large-scale measured data extracted from real mechanical surfaces, and the final results show that the proposed framework achieved higher efficiency. It also proved that this framework satisfies the real-time and big data requirements in micro- and nano-surface texture measurement.","PeriodicalId":417370,"journal":{"name":"2016 10th International Conference on Software, Knowledge, Information Management & Applications (SKIMA)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1900-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":"{\"title\":\"Multi-GPUs Gaussian filtering for real-time big data processing\",\"authors\":\"Chaolong Zhang, Yuanping Xu, Jia He, Jun Lu, Li Lu, Zhijie Xu\",\"doi\":\"10.1109/SKIMA.2016.7916225\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Gaussian filtering has been extensively used in the field of surface metrology. However, the computing performance becomes a core bottleneck for Gaussian filtering algorithm based applications when facing large-scale and/or real-time data processing. Although researchers tried to accelerate Gaussian filtering algorithm by using GPU (Graphics Processing Unit), single GPU still fail to meet the large-scale and real-time requirements of surface texture micro- and nano-measurements. Therefore, to solve this bottleneck problem, this paper proposes a single node multi-GPUs based computing framework to accelerate the 2D Gaussian filtering algorithm. This paper presents that the devised framework seamlessly integrated the multi-level spatial domain decomposition method and the CUDA stream mechanism to overlap the two main time consuming steps, i.e., the data transfer and GPU kernel execution, such that it can increase concurrency and reduce the overall running time. This paper also tests and evaluates the proposed computing framework with other three conventional solutions by using large-scale measured data extracted from real mechanical surfaces, and the final results show that the proposed framework achieved higher efficiency. It also proved that this framework satisfies the real-time and big data requirements in micro- and nano-surface texture measurement.\",\"PeriodicalId\":417370,\"journal\":{\"name\":\"2016 10th International Conference on Software, Knowledge, Information Management & Applications (SKIMA)\",\"volume\":\"1 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"1900-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"4\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2016 10th International Conference on Software, Knowledge, Information Management & Applications (SKIMA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/SKIMA.2016.7916225\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2016 10th International Conference on Software, Knowledge, Information Management & Applications (SKIMA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SKIMA.2016.7916225","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

摘要

高斯滤波在曲面测量领域得到了广泛的应用。然而,当面对大规模或实时的数据处理时,计算性能成为基于高斯滤波算法的应用的核心瓶颈。尽管研究者尝试使用GPU (Graphics Processing Unit)来加速高斯滤波算法,但单个GPU仍然无法满足表面纹理微纳米测量的大规模和实时性要求。因此,为了解决这一瓶颈问题,本文提出了一种基于单节点多gpu的计算框架来加速二维高斯滤波算法。本文提出设计的框架将多层空间域分解方法与CUDA流机制无缝集成,重叠了数据传输和GPU内核执行这两个主要耗时步骤,从而提高了并发性,减少了整体运行时间。本文还利用从实际机械表面提取的大规模测量数据,对所提出的计算框架与其他三种常规解决方案进行了测试和评估,最终结果表明所提出的计算框架具有更高的效率。实验证明,该框架能够满足微纳米表面纹理测量的实时性和大数据要求。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
Multi-GPUs Gaussian filtering for real-time big data processing
Gaussian filtering has been extensively used in the field of surface metrology. However, the computing performance becomes a core bottleneck for Gaussian filtering algorithm based applications when facing large-scale and/or real-time data processing. Although researchers tried to accelerate Gaussian filtering algorithm by using GPU (Graphics Processing Unit), single GPU still fail to meet the large-scale and real-time requirements of surface texture micro- and nano-measurements. Therefore, to solve this bottleneck problem, this paper proposes a single node multi-GPUs based computing framework to accelerate the 2D Gaussian filtering algorithm. This paper presents that the devised framework seamlessly integrated the multi-level spatial domain decomposition method and the CUDA stream mechanism to overlap the two main time consuming steps, i.e., the data transfer and GPU kernel execution, such that it can increase concurrency and reduce the overall running time. This paper also tests and evaluates the proposed computing framework with other three conventional solutions by using large-scale measured data extracted from real mechanical surfaces, and the final results show that the proposed framework achieved higher efficiency. It also proved that this framework satisfies the real-time and big data requirements in micro- and nano-surface texture measurement.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
确定
请完成安全验证×
copy
已复制链接
快去分享给好友吧!
我知道了
右上角分享
点击右上角分享
0
联系我们:info@booksci.cn Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。 Copyright © 2023 布克学术 All rights reserved.
京ICP备2023020795号-1
ghs 京公网安备 11010802042870号
Book学术文献互助
Book学术文献互助群
群 号:604180095
Book学术官方微信