Fuping Niu, Jianhui Yue, Jiangqiu Shen, Xiaofei Liao, Haikun Liu, Hai Jin
{"title":"FlashWalker: An In-Storage Accelerator for Graph Random Walks","authors":"Fuping Niu, Jianhui Yue, Jiangqiu Shen, Xiaofei Liao, Haikun Liu, Hai Jin","doi":"10.1109/ipdps53621.2022.00107","DOIUrl":null,"url":null,"abstract":"Graph random walk is widely used in the graph processing as it is a fundamental component in graph analysis, ranging from vertices ranking to the graph embedding. Different from traditional graph processing workload, random walk features massive processing parallelisms and poor graph data reuse, being limited by low I/O efficiency. Prior designs for random walk mitigate slow I/O operations. However, the state-of-the-art random walk processing systems are bounded by slow disk I/O bandwidth, which is confirmed by our experiments with real-world graphs. To address this issue, we propose FlashWalker, an in-storage accelerator for random walk that moves walk updating close to graph data stored in flash memory, by exploiting significant parallelisms inside SSD. Featuring a heterogeneous and parallel processing system, FlashWalker includes a board-level accelerator, channel-level accelerators, and chip-level accelerators. To address challenges posed by the tight resource constraints for processing large-scale graphs, we propose novel designs: storing a few popular subgraphs in accelerators, the pre-walking for dense walks, two optimizations to search the subgraph mapping table, and a subgraph scheduling algorithm. We implement FlashWalker in RTL, showing small circuit area overhead. Our evaluation shows FlashWalker reduces the execution time of random walk algorithms by up to 660.50×, compared with GraphWalker, which is the state-of-the-art system for random walk algorithms.","PeriodicalId":321801,"journal":{"name":"2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","volume":"23 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ipdps53621.2022.00107","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1
Abstract
Graph random walk is widely used in the graph processing as it is a fundamental component in graph analysis, ranging from vertices ranking to the graph embedding. Different from traditional graph processing workload, random walk features massive processing parallelisms and poor graph data reuse, being limited by low I/O efficiency. Prior designs for random walk mitigate slow I/O operations. However, the state-of-the-art random walk processing systems are bounded by slow disk I/O bandwidth, which is confirmed by our experiments with real-world graphs. To address this issue, we propose FlashWalker, an in-storage accelerator for random walk that moves walk updating close to graph data stored in flash memory, by exploiting significant parallelisms inside SSD. Featuring a heterogeneous and parallel processing system, FlashWalker includes a board-level accelerator, channel-level accelerators, and chip-level accelerators. To address challenges posed by the tight resource constraints for processing large-scale graphs, we propose novel designs: storing a few popular subgraphs in accelerators, the pre-walking for dense walks, two optimizations to search the subgraph mapping table, and a subgraph scheduling algorithm. We implement FlashWalker in RTL, showing small circuit area overhead. Our evaluation shows FlashWalker reduces the execution time of random walk algorithms by up to 660.50×, compared with GraphWalker, which is the state-of-the-art system for random walk algorithms.