Acceleration of GPU-Based Ultrasound Simulation via Data Compression

2014 IEEE International Parallel & Distributed Processing Symposium Workshops | Pub Date: 2014-05-19 | DOI: 10.1109/IPDPSW.2014.140

Andrew A. Haigh, Eric C. McCreath

Citations: 1

Abstract

The realistic simulation of ultrasound wave propagation is computationally intensive. The large size of the grid and the low degree of data reuse mean that it places a great demand on memory bandwidth. Graphics Processing Units (GPUs) have attracted attention for scientific calculations because of their potential to perform large numbers of floating-point computations efficiently. However, many applications may be limited by memory bandwidth, especially for data sets that are larger than the memory of the GPU platform. This problem is only partially mitigated by the standard technique of breaking the grid into regions and overlapping the computation of one region with the host-device memory transfer of another. In this paper, we implement a memory-bound GPU-based ultrasound simulation and evaluate a technique for improving performance: compressing the data into a fixed-point representation, which reduces the time required for transfers between host and device. We demonstrate a speedup of 1.5 times on a simulation where the data is broken into regions that must be copied back and forth between the CPU and GPU. We develop a model that can be used to determine the amount of temporal blocking required to achieve near-optimal performance without extensive experimentation. This technique may also be applied to GPU-based scientific simulations in other domains, such as computational fluid dynamics and electromagnetic wave simulation.
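The fixed-point compression idea mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the bit width, the scaling scheme, or any function names, so everything below is an assumption made for illustration: a 16-bit signed representation, one scale factor per region, and the hypothetical helpers `compress_region` and `decompress_region`. It runs on the CPU in plain Python and is not the authors' GPU implementation.

```python
from array import array

MAX_INT16 = 2 ** 15 - 1  # largest magnitude stored per value (assumed 16-bit format)


def compress_region(values):
    """Quantize floats to 16-bit signed integers sharing one scale factor.

    Hypothetical helper: the per-region scale is an assumption, not a
    detail taken from the paper.
    """
    peak = max((abs(v) for v in values), default=0.0)
    scale = peak / MAX_INT16 if peak > 0 else 1.0
    quantized = array("h")  # 'h' is the signed short typecode
    for v in values:
        q = round(v / scale)
        # Clamp to guard against floating-point overshoot at the peak value.
        q = max(-MAX_INT16, min(MAX_INT16, q))
        quantized.append(q)
    return quantized, scale


def decompress_region(quantized, scale):
    """Recover approximate floats; the error per value is about scale / 2 at most."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    region = [0.0, 0.5, -1.0, 0.25, 0.001]
    packed, scale = compress_region(region)
    restored = decompress_region(packed, scale)
    print("bytes to transfer:", len(packed.tobytes()))
    print("restored values:", restored)
```

Under these assumptions each value occupies 2 bytes instead of the 4 bytes of a single-precision float, so the data volume copied between host and device is halved, at the cost of a bounded quantization error and the extra work of compressing and decompressing each region.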
