Chaolong Zhang, Yuanping Xu, Jia He, Jun Lu, Li Lu, Zhijie Xu
2016 10th International Conference on Software, Knowledge, Information Management & Applications (SKIMA). DOI: 10.1109/SKIMA.2016.7916225 (https://doi.org/10.1109/SKIMA.2016.7916225)
Multi-GPUs Gaussian filtering for real-time big data processing
Gaussian filtering is widely used in surface metrology. However, computing performance becomes a core bottleneck for applications based on the Gaussian filtering algorithm when they face large-scale and/or real-time data processing. Although researchers have tried to accelerate the Gaussian filtering algorithm with a GPU (Graphics Processing Unit), a single GPU still fails to meet the large-scale and real-time requirements of micro- and nano-scale surface texture measurement. To address this bottleneck, this paper proposes a single-node, multi-GPU computing framework that accelerates the 2D Gaussian filtering algorithm. The framework integrates a multi-level spatial domain decomposition method with the CUDA stream mechanism to overlap the two most time-consuming steps, data transfer and GPU kernel execution, which increases concurrency and reduces the overall running time. The paper also evaluates the proposed framework against three conventional solutions using large-scale measured data extracted from real mechanical surfaces, and the results show that the proposed framework achieves higher efficiency. The results also show that the framework satisfies the real-time and big data requirements of micro- and nano-surface texture measurement.
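The spatial domain decomposition mentioned in the abstract can be illustrated with a minimal CPU-only sketch. It is not the authors' implementation: the function names (`gaussian_kernel`, `filter_2d`, `filter_2d_tiled`), the row-wise split, the replicate-edge border handling, and the halo width are all assumptions made for illustration. The sketch splits a 2D data set into row bands, extends each band by a halo of `radius` rows so that it can be filtered independently (as it might be if each band were sent to a different GPU), and stitches the results together. It does not model CUDA streams or the overlap of data transfer with kernel execution.

```python
import math


def gaussian_kernel(sigma, radius):
    """Return normalized 1D Gaussian weights of length 2 * radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def filter_2d(tile, kernel):
    """Separable 2D Gaussian filter of a whole tile (list of rows).

    Samples outside the tile are replaced by the nearest edge sample
    (an assumed border policy, chosen only to keep the sketch short).
    """
    radius = len(kernel) // 2
    rows, cols = len(tile), len(tile[0])

    def clamp(i, n):
        return min(max(i, 0), n - 1)

    # Horizontal pass: each row is filtered independently.
    tmp = [[sum(kernel[k + radius] * row[clamp(c + k, cols)]
                for k in range(-radius, radius + 1))
            for c in range(cols)]
           for row in tile]
    # Vertical pass: needs up to `radius` neighboring rows above and below.
    return [[sum(kernel[k + radius] * tmp[clamp(r + k, rows)][c]
                 for k in range(-radius, radius + 1))
             for c in range(cols)]
            for r in range(rows)]


def filter_2d_tiled(data, kernel, n_tiles):
    """Filter `data` band by band; each band carries a halo of `radius` rows.

    Every band could be processed on a separate device, because the halo
    supplies all the neighboring rows that the vertical pass reads.
    """
    radius = len(kernel) // 2
    rows = len(data)
    out = []
    for t in range(n_tiles):
        start = t * rows // n_tiles
        end = (t + 1) * rows // n_tiles
        if start == end:
            continue  # more tiles than rows: skip empty bands
        lo = max(start - radius, 0)    # first halo row (clipped at the top)
        hi = min(end + radius, rows)   # one past the last halo row
        filtered = filter_2d(data[lo:hi], kernel)
        # Keep only the rows this band owns; discard the halo rows.
        out.extend(filtered[start - lo:start - lo + (end - start)])
    return out


if __name__ == "__main__":
    # Small synthetic surface; real measured data would be far larger.
    surface = [[float((r * 7 + c * 3) % 11) for c in range(7)] for r in range(12)]
    kernel = gaussian_kernel(sigma=1.0, radius=2)
    whole = filter_2d(surface, kernel)
    tiled = filter_2d_tiled(surface, kernel, n_tiles=3)
    max_diff = max(abs(a - b) for ra, rb in zip(whole, tiled) for a, b in zip(ra, rb))
    print("largest difference between whole and tiled result:", max_diff)
```

The halo is what makes the bands independent: without it, rows near a band boundary would be filtered against clamped edge values instead of their true neighbors, and the stitched output would differ from filtering the full data set at once.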