FlashWalker: An In-Storage Accelerator for Graph Random Walks

2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS), Pub Date: 2022-05-01, DOI: 10.1109/ipdps53621.2022.00107

Fuping Niu, Jianhui Yue, Jiangqiu Shen, Xiaofei Liao, Haikun Liu, Hai Jin


Citations: 1

Abstract

Graph random walks are widely used in graph processing as a fundamental component of graph analysis, with applications ranging from vertex ranking to graph embedding. Unlike traditional graph processing workloads, random walks exhibit massive processing parallelism and poor graph data reuse, and are limited by low I/O efficiency. Prior designs for random walks mitigate slow I/O operations. However, state-of-the-art random walk processing systems remain bounded by slow disk I/O bandwidth, as our experiments with real-world graphs confirm. To address this issue, we propose FlashWalker, an in-storage accelerator for random walks that moves walk updating close to the graph data stored in flash memory by exploiting the significant parallelism inside an SSD. FlashWalker is a heterogeneous, parallel processing system comprising a board-level accelerator, channel-level accelerators, and chip-level accelerators. To address the challenges posed by tight resource constraints when processing large-scale graphs, we propose several novel designs: storing a few popular subgraphs in the accelerators, pre-walking for dense walks, two optimizations for searching the subgraph mapping table, and a subgraph scheduling algorithm. We implement FlashWalker in RTL and show that its circuit area overhead is small. Our evaluation shows that FlashWalker reduces the execution time of random walk algorithms by up to 660.50× compared with GraphWalker, the state-of-the-art system for random walk algorithms.
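To make the terms in the abstract concrete, the following is a minimal software sketch of subgraph-partitioned random walks. It is not FlashWalker's hardware design: the contiguous vertex-range partitioning, the binary-searched mapping table, the "most pending walks first" scheduling policy, and all function names (`build_subgraphs`, `subgraph_of`, `run_walks`) are assumptions chosen for illustration only. It shows why walks that leave the loaded subgraph force another subgraph load, which is the I/O cost the abstract refers to.

```python
import bisect
import random
from collections import defaultdict


def build_subgraphs(adj, block_size):
    """Hypothetical partitioning: contiguous vertex-ID ranges of block_size.

    Returns the sorted list of range start IDs, used as a toy mapping table.
    """
    return list(range(0, len(adj), block_size))


def subgraph_of(starts, v):
    """Binary-search the toy mapping table for the subgraph that holds v."""
    return bisect.bisect_right(starts, v) - 1


def run_walks(adj, starts, sources, length, seed=0):
    """Run one walk of `length` steps from each source vertex.

    Returns (final_vertices, subgraph_loads). Each loop iteration stands in
    for loading one subgraph and advancing every walk waiting on it.
    """
    rng = random.Random(seed)
    pending = defaultdict(list)  # subgraph id -> [(vertex, steps_left)]
    for s in sources:
        pending[subgraph_of(starts, s)].append((s, length))

    loads = 0
    finished = []
    while pending:
        # Assumed policy: serve the subgraph with the most waiting walks.
        sg = max(pending, key=lambda k: len(pending[k]))
        walks = pending.pop(sg)
        loads += 1
        for v, left in walks:
            # Advance the walk while it stays inside the loaded subgraph.
            while left > 0 and adj[v] and subgraph_of(starts, v) == sg:
                v = rng.choice(adj[v])
                left -= 1
            if left == 0 or not adj[v]:
                finished.append(v)  # walk done, or stuck at a sink vertex
            else:
                # Walk crossed into another subgraph; queue it there.
                pending[subgraph_of(starts, v)].append((v, left))
    return finished, loads


if __name__ == "__main__":
    # Directed ring of 8 vertices, split into two subgraphs of 4 vertices.
    ring = [[(i + 1) % 8] for i in range(8)]
    table = build_subgraphs(ring, 4)
    print(run_walks(ring, table, sources=[0], length=5))
```

In this toy setting each walk touches a vertex's edge list once and moves on, so little loaded data is reused; processing walks next to the stored data, as the abstract proposes, targets exactly that cost.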

