Exploiting NVM in large-scale graph analytics

INFLOW '15 | Pub Date: 2015-10-04 | DOI: 10.1145/2819001.2819005
Jasmina Malicevic, Subramanya R. Dulloor, N. Sundaram, N. Satish, Jeffrey R. Jackson, W. Zwaenepoel
{"title":"Exploiting NVM in large-scale graph analytics","authors":"Jasmina Malicevic, Subramanya R. Dulloor, N. Sundaram, N. Satish, Jeffrey R. Jackson, W. Zwaenepoel","doi":"10.1145/2819001.2819005","DOIUrl":null,"url":null,"abstract":"Data center applications like graph analytics require servers with ever larger memory capacities. DRAM scaling, however, is not able to match the increasing demands for capacity. Emerging byte-addressable, non-volatile memory technologies (NVM) offer a more scalable alternative, with memory that is directly addressable to software, but at a higher latency and lower bandwidth.\n Using an NVM hardware emulator, we study the suitability of NVM in meeting the memory demands of four state of the art graph analytics frameworks, namely Graphlab, Galois, X-Stream and Graphmat. We evaluate their performance with popular algorithms (Pagerank, BFS, Triangle Counting and Collaborative filtering) by allocating memory exclusive from DRAM (DRAM-only) or emulated NVM (NVM-only).\n While all of these applications are sensitive to higher latency or lower bandwidth of NVM, resulting in performance degradation of up to 4x with NVM-only (compared to DRAM-only), we show that the performance impact is somewhat mitigated in the frameworks that exploit CPU memory-level parallelism and hardware prefetchers.\n Further, we demonstrate that, in a hybrid memory system with NVM and DRAM, intelligent placement of application data based on their relative importance may help offset the overheads of the NVM-only solution in a cost-effective manner (i.e., using only a small amount of DRAM). Specifically, we show that, depending on the algorithm, Graphmat can achieve close to DRAM-only performance (within 1.2x) by placing only 6.7% to 31.5% of its total memory footprint in DRAM.","PeriodicalId":293142,"journal":{"name":"INFLOW '15","volume":"212 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"29","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"INFLOW '15","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2819001.2819005","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 29

Abstract

Data center applications like graph analytics require servers with ever larger memory capacities. DRAM scaling, however, is not able to match the increasing demands for capacity. Emerging byte-addressable, non-volatile memory (NVM) technologies offer a more scalable alternative, with memory that is directly addressable by software, but at higher latency and lower bandwidth.

Using an NVM hardware emulator, we study the suitability of NVM in meeting the memory demands of four state-of-the-art graph analytics frameworks, namely Graphlab, Galois, X-Stream and Graphmat. We evaluate their performance with popular algorithms (Pagerank, BFS, Triangle Counting and Collaborative filtering) by allocating memory exclusively from DRAM (DRAM-only) or from emulated NVM (NVM-only).

While all of these applications are sensitive to the higher latency or lower bandwidth of NVM, resulting in performance degradation of up to 4x with NVM-only (compared to DRAM-only), we show that the performance impact is somewhat mitigated in the frameworks that exploit CPU memory-level parallelism and hardware prefetchers.

Further, we demonstrate that, in a hybrid memory system with NVM and DRAM, intelligent placement of application data based on their relative importance may help offset the overheads of the NVM-only solution in a cost-effective manner (i.e., using only a small amount of DRAM). Specifically, we show that, depending on the algorithm, Graphmat can achieve close to DRAM-only performance (within 1.2x) by placing only 6.7% to 31.5% of its total memory footprint in DRAM.
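
To make the hybrid-placement idea concrete, the following is a minimal sketch, not the paper's implementation, of keeping a small, frequently accessed structure in DRAM while the bulk of the graph data lives in NVM. It assumes a platform that exposes (emulated) NVM as a separate NUMA node, hypothetically node 1 with DRAM on node 0, and uses libnuma's numa_alloc_onnode(); the array names, sizes, and node numbers are illustrative only.

/*
 * Sketch: per-structure DRAM/NVM placement via NUMA nodes (assumption:
 * DRAM on node 0, emulated NVM exposed as node 1). Build with: cc -lnuma
 */
#include <numa.h>
#include <stdio.h>
#include <stdlib.h>

#define DRAM_NODE 0  /* assumed local DRAM node   */
#define NVM_NODE  1  /* assumed emulated NVM node */

int main(void) {
    if (numa_available() < 0) {
        fprintf(stderr, "NUMA not available on this system\n");
        return 1;
    }

    size_t n_vertices = 1 << 20;          /* illustrative graph size    */
    size_t n_edges    = 16 * n_vertices;  /* illustrative average degree */

    /* Small, latency-sensitive per-vertex state: place in DRAM. */
    double *vertex_rank =
        numa_alloc_onnode(n_vertices * sizeof(double), DRAM_NODE);

    /* Large, mostly streamed edge array: place in (emulated) NVM. */
    long *edge_dst =
        numa_alloc_onnode(n_edges * sizeof(long), NVM_NODE);

    if (!vertex_rank || !edge_dst) {
        fprintf(stderr, "allocation failed\n");
        return 1;
    }

    /* ... run the graph algorithm over vertex_rank / edge_dst ... */

    numa_free(vertex_rank, n_vertices * sizeof(double));
    numa_free(edge_dst, n_edges * sizeof(long));
    return 0;
}

The sketch only illustrates the mechanism; in the paper's hybrid experiments the placement decision per data structure is driven by its relative importance, and the reported result is that a small DRAM share (6.7% to 31.5% of the footprint for Graphmat) captures most of the DRAM-only performance.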