{"title":"Exploiting GPU Direct Access to Non-Volatile Memory to Accelerate Big Data Processing","authors":"Mahsa Bayati, M. Leeser, N. Mi","doi":"10.1109/HPEC43674.2020.9286174","DOIUrl":null,"url":null,"abstract":"The amount of data being collected for analysis is growing at an exponential rate. Along with this growth comes increasing necessity for computation and storage. Researchers are addressing these needs by building heterogeneous clusters with CPUs and computational accelerators such as GPUs equipped with high I/O bandwidth storage devices. One of the main bottlenecks of such heterogeneous systems is the data transfer bandwidth to GPUs when running I/O intensive applications. The traditional approach gets data from storage to the host memory and then transfers it to the GPU, which can limit data throughput and processing and thus degrade the end-to-end performance. In this paper, we propose a new framework to address the above issue by exploiting Peer-to-Peer Direct Memory Access to allow GPU direct access of the storage device and thus enhance the performance for parallel data processing applications in a heterogeneous big-data platform. Our heterogeneous cluster is supplied with CPUs and GPUs as computing resources and Non-Volatile Memory express (NVMe) drives as storage resources. We deploy an Apache Spark platform to execute representative data processing workloads over this heterogeneous cluster and then adopt Peer-to-Peer Direct Memory Access to connect GPUs to non-volatile storage directly to optimize the GPU data access. Experimental results reveal that this heterogeneous Spark platform successfully bypasses the host memory and enables GPUs to communicate directly to the NVMe drive, thus achieving higher data transfer throughput and improving both data communication time and end-to-end nerformance by 20%.","PeriodicalId":168544,"journal":{"name":"2020 IEEE High Performance Extreme Computing Conference (HPEC)","volume":"5 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2020-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"2","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2020 IEEE High Performance Extreme Computing Conference (HPEC)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/HPEC43674.2020.9286174","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Abstract
The amount of data being collected for analysis is growing at an exponential rate. Along with this growth comes an increasing need for computation and storage. Researchers are addressing these needs by building heterogeneous clusters with CPUs and computational accelerators such as GPUs, equipped with high I/O bandwidth storage devices. One of the main bottlenecks of such heterogeneous systems is the data transfer bandwidth to GPUs when running I/O intensive applications. The traditional approach moves data from storage to host memory and then transfers it to the GPU, which can limit data throughput and processing and thus degrade end-to-end performance. In this paper, we propose a new framework to address the above issue by exploiting Peer-to-Peer Direct Memory Access to allow the GPU direct access to the storage device and thus enhance the performance of parallel data processing applications in a heterogeneous big-data platform. Our heterogeneous cluster is supplied with CPUs and GPUs as computing resources and Non-Volatile Memory Express (NVMe) drives as storage resources. We deploy an Apache Spark platform to execute representative data processing workloads over this heterogeneous cluster and then adopt Peer-to-Peer Direct Memory Access to connect GPUs to non-volatile storage directly and optimize GPU data access. Experimental results reveal that this heterogeneous Spark platform successfully bypasses host memory and enables GPUs to communicate directly with the NVMe drive, thus achieving higher data transfer throughput and improving both data communication time and end-to-end performance by 20%.
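To make the peer-to-peer idea concrete, below is a minimal, hedged sketch of a direct NVMe-to-GPU read using NVIDIA's cuFile (GPUDirect Storage) API, which is one publicly available realization of this kind of P2P DMA; the abstract does not state which driver stack the authors' framework uses, so this is an illustration rather than their implementation. The file path and transfer size are placeholders.

```cpp
// Sketch: read a file on an NVMe drive directly into GPU memory via cuFile,
// so the NVMe controller DMAs the data into the GPU buffer and the host
// memory bounce buffer of the traditional read()+cudaMemcpy path is skipped.
#include <fcntl.h>
#include <unistd.h>
#include <cstdio>
#include <cuda_runtime.h>
#include <cufile.h>

int main() {
    const char *path = "/mnt/nvme/block.dat";   // placeholder NVMe-backed file
    const size_t nbytes = 64 << 20;             // 64 MiB read, for illustration

    cuFileDriverOpen();                         // initialize the GDS driver

    // O_DIRECT is required so the read bypasses the host page cache.
    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0) { perror("open"); return 1; }

    CUfileDescr_t descr = {};
    descr.handle.fd = fd;
    descr.type = CU_FILE_HANDLE_TYPE_OPAQUE_FD;
    CUfileHandle_t fh;
    cuFileHandleRegister(&fh, &descr);

    void *dev_buf = nullptr;
    cudaMalloc(&dev_buf, nbytes);               // destination lives in GPU memory
    cuFileBufRegister(dev_buf, nbytes, 0);      // register the GPU buffer for DMA

    // Data lands directly in dev_buf without staging through host memory.
    ssize_t got = cuFileRead(fh, dev_buf, nbytes,
                             /*file_offset=*/0, /*devPtr_offset=*/0);
    printf("read %zd bytes directly into GPU memory\n", got);

    cuFileBufDeregister(dev_buf);
    cuFileHandleDeregister(fh);
    cudaFree(dev_buf);
    close(fd);
    cuFileDriverClose();
    return 0;
}
```

Built with, e.g., `nvcc -o p2p_read p2p_read.cu -lcufile`, this path replaces the two-hop storage-to-host-to-GPU copy described above with a single DMA into device memory, which is the mechanism the paper credits for its throughput and end-to-end gains.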