GNNBoost: Accelerating sampling-based GNN training on large scale graph by optimizing data preparation

IF 4.1 | Zone 2, Computer Science | Q1, COMPUTER SCIENCE, HARDWARE & ARCHITECTURE

Journal of Systems Architecture | Pub Date: 2025-05-30 | DOI: 10.1016/j.sysarc.2025.103456

Yujuan Tan, Yan Gan, Zhaoyang Zeng, Zhuoxin Bai, Lei Qiao, Duo Liu, Kan Zhong, Ao Ren

Volume 167, Article 103456. Not open access. https://www.sciencedirect.com/science/article/pii/S1383762125001286

Citations: 0

Abstract

Graph Neural Networks (GNNs) have successfully extended deep learning from traditional Euclidean spaces to complex graph structures. Sampling-based GNN training has been widely adopted for large-scale graphs without compromising accuracy. However, graph irregularity results in imbalanced sampling workloads, making it challenging for existing GNN systems to utilize GPU resources effectively for graph sampling. Additionally, in GNN systems where both topology and feature caches are enabled, differences in the characteristics and purposes of the cached data make it difficult to allocate GPU memory between the two caches with minimal overhead. To address these challenges, we propose GNNBoost, a framework designed to accelerate GNN training. GNNBoost consists of two key innovations. First, GNNBoost introduces a degree-oriented sampling schedule that groups training vertices based on their degrees and applies tailored sampling strategies to balance GPU workloads and improve sampling performance. Second, GNNBoost develops a low-overhead cache space allocation mechanism that accurately determines the optimal cache sizes for graph topology and features across different workloads, minimizing both space and time overheads. We conduct a comprehensive evaluation of GNNBoost across various GNN models and large graph datasets, demonstrating that it significantly outperforms existing GNN training systems.
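
The abstract does not say how the degree-oriented schedule forms its groups. As an illustration only, here is a minimal sketch of what grouping training vertices by degree could look like. The function name `group_by_degree`, the three bucket labels, and the cut-offs 32 and 1024 are all hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable, List, Tuple


def group_by_degree(
    degrees: Dict[int, int],
    train_vertices: Iterable[int],
    thresholds: Tuple[int, int] = (32, 1024),  # hypothetical cut-offs
) -> Dict[str, List[int]]:
    """Bucket training vertices into low/mid/high degree groups.

    Illustrative assumption, not GNNBoost's actual policy: a vertex with
    degree <= thresholds[0] is "low", <= thresholds[1] is "mid", and
    anything larger is "high".
    """
    low_max, mid_max = thresholds
    groups: Dict[str, List[int]] = {"low": [], "mid": [], "high": []}
    for v in train_vertices:
        d = degrees[v]
        if d <= low_max:
            groups["low"].append(v)
        elif d <= mid_max:
            groups["mid"].append(v)
        else:
            groups["high"].append(v)
    return groups


# Toy example: vertex id -> degree
toy_degrees = {0: 3, 1: 50, 2: 5000, 3: 32}
toy_groups = group_by_degree(toy_degrees, [0, 1, 2, 3])
```

In a scheme like this, each group could then be handed to a sampling routine suited to its degree range, so that work items within a group have similar cost. Which strategies GNNBoost actually applies per group is not stated in the abstract.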


Source journal

Journal of Systems Architecture (Engineering & Technology, Computer Science: Hardware)

CiteScore: 8.70

Self-citation rate: 15.60%

Articles published: 226

Review time: 46 days

About the journal: The Journal of Systems Architecture: Embedded Software Design (JSA) is a journal covering all design and architectural aspects related to embedded systems and software. It ranges from the microarchitecture level via the system software level up to the application-specific architecture level. Aspects such as real-time systems, operating systems, FPGA programming, programming languages, communications (limited to analysis and the software stack), mobile systems, and parallel and distributed architectures, as well as additional subjects in the computer and system architecture area, fall within the scope of this journal. Technology will not be a main focus, but its use and relevance to particular designs will be. Case studies are welcome but must contribute more than just a design for a particular piece of software. Design automation of such systems, including methodologies, techniques, and tools for their design, as well as novel designs of software components, falls within the scope of this journal. Novel applications that use embedded systems are also central to this journal. While hardware is not a part of this journal, hardware/software co-design methods that consider the interplay between software and hardware components, with an emphasis on software, are also relevant here.