Can Graph Reordering Speed Up Graph Neural Network Training? An Experimental Study

Nikolai Merkel, Pierre Toussing, Ruben Mayer, Hans-Arno Jacobsen
{"title":"图重排能加速图神经网络训练吗?实验研究","authors":"Nikolai Merkel, Pierre Toussing, Ruben Mayer, Hans-Arno Jacobsen","doi":"arxiv-2409.11129","DOIUrl":null,"url":null,"abstract":"Graph neural networks (GNNs) are a type of neural network capable of learning\non graph-structured data. However, training GNNs on large-scale graphs is\nchallenging due to iterative aggregations of high-dimensional features from\nneighboring vertices within sparse graph structures combined with neural\nnetwork operations. The sparsity of graphs frequently results in suboptimal\nmemory access patterns and longer training time. Graph reordering is an\noptimization strategy aiming to improve the graph data layout. It has shown to\nbe effective to speed up graph analytics workloads, but its effect on the\nperformance of GNN training has not been investigated yet. The generalization\nof reordering to GNN performance is nontrivial, as multiple aspects must be\nconsidered: GNN hyper-parameters such as the number of layers, the number of\nhidden dimensions, and the feature size used in the GNN model, neural network\noperations, large intermediate vertex states, and GPU acceleration. In our work, we close this gap by performing an empirical evaluation of 12\nreordering strategies in two state-of-the-art GNN systems, PyTorch Geometric\nand Deep Graph Library. Our results show that graph reordering is effective in\nreducing training time for CPU- and GPU-based training, respectively. Further,\nwe find that GNN hyper-parameters influence the effectiveness of reordering,\nthat reordering metrics play an important role in selecting a reordering\nstrategy, that lightweight reordering performs better for GPU-based than for\nCPU-based training, and that invested reordering time can in many cases be\namortized.","PeriodicalId":501291,"journal":{"name":"arXiv - CS - Performance","volume":"34 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Can Graph Reordering Speed Up Graph Neural Network Training? An Experimental Study\",\"authors\":\"Nikolai Merkel, Pierre Toussing, Ruben Mayer, Hans-Arno Jacobsen\",\"doi\":\"arxiv-2409.11129\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Graph neural networks (GNNs) are a type of neural network capable of learning\\non graph-structured data. However, training GNNs on large-scale graphs is\\nchallenging due to iterative aggregations of high-dimensional features from\\nneighboring vertices within sparse graph structures combined with neural\\nnetwork operations. The sparsity of graphs frequently results in suboptimal\\nmemory access patterns and longer training time. Graph reordering is an\\noptimization strategy aiming to improve the graph data layout. It has shown to\\nbe effective to speed up graph analytics workloads, but its effect on the\\nperformance of GNN training has not been investigated yet. The generalization\\nof reordering to GNN performance is nontrivial, as multiple aspects must be\\nconsidered: GNN hyper-parameters such as the number of layers, the number of\\nhidden dimensions, and the feature size used in the GNN model, neural network\\noperations, large intermediate vertex states, and GPU acceleration. In our work, we close this gap by performing an empirical evaluation of 12\\nreordering strategies in two state-of-the-art GNN systems, PyTorch Geometric\\nand Deep Graph Library. 
Our results show that graph reordering is effective in\\nreducing training time for CPU- and GPU-based training, respectively. Further,\\nwe find that GNN hyper-parameters influence the effectiveness of reordering,\\nthat reordering metrics play an important role in selecting a reordering\\nstrategy, that lightweight reordering performs better for GPU-based than for\\nCPU-based training, and that invested reordering time can in many cases be\\namortized.\",\"PeriodicalId\":501291,\"journal\":{\"name\":\"arXiv - CS - Performance\",\"volume\":\"34 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2024-09-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"arXiv - CS - Performance\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/arxiv-2409.11129\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"arXiv - CS - Performance","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/arxiv-2409.11129","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0

Abstract

Graph neural networks (GNNs) are a type of neural network capable of learning on graph-structured data. However, training GNNs on large-scale graphs is challenging due to iterative aggregations of high-dimensional features from neighboring vertices within sparse graph structures, combined with neural network operations. The sparsity of graphs frequently results in suboptimal memory access patterns and longer training times. Graph reordering is an optimization strategy that aims to improve the graph data layout. It has been shown to be effective at speeding up graph analytics workloads, but its effect on the performance of GNN training has not been investigated yet. The generalization of reordering benefits to GNN performance is nontrivial, as multiple aspects must be considered: GNN hyper-parameters (such as the number of layers, the number of hidden dimensions, and the feature size used in the GNN model), neural network operations, large intermediate vertex states, and GPU acceleration. In our work, we close this gap by performing an empirical evaluation of 12 reordering strategies in two state-of-the-art GNN systems, PyTorch Geometric and Deep Graph Library. Our results show that graph reordering is effective in reducing training time for both CPU- and GPU-based training. Further, we find that GNN hyper-parameters influence the effectiveness of reordering, that reordering metrics play an important role in selecting a reordering strategy, that lightweight reordering performs better for GPU-based than for CPU-based training, and that the invested reordering time can in many cases be amortized.
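To make the idea concrete, the sketch below applies one classic reordering strategy, Reverse Cuthill-McKee (RCM), to a graph before training. This is only an illustrative example under assumptions: the paper evaluates 12 strategies, and the function name, toy graph, and feature data here are ours, not the authors' pipeline. The point is the mechanism: relabel vertices so that neighbors receive nearby IDs, then permute the adjacency structure and the feature matrix consistently, so that feature aggregations touch contiguous memory.

```python
# Minimal sketch (not the paper's code): reordering a graph with
# Reverse Cuthill-McKee before handing it to a GNN system.
import numpy as np
import scipy.sparse as sp
from scipy.sparse.csgraph import reverse_cuthill_mckee

def reorder_graph(adj: sp.csr_matrix, features: np.ndarray):
    """Permute vertices so neighbors get nearby IDs, improving the
    locality of neighbor-feature aggregations during GNN training."""
    perm = reverse_cuthill_mckee(adj, symmetric_mode=True)
    adj_reordered = adj[perm, :][:, perm]   # permute rows and columns together
    feats_reordered = features[perm]        # keep features aligned with new IDs
    return adj_reordered, feats_reordered, perm

# Toy input: a sparse random graph with 1000 vertices and 32-dim features.
rng = np.random.default_rng(0)
adj = sp.random(1000, 1000, density=0.01, format="csr", random_state=0)
adj = ((adj + adj.T) > 0).astype(np.float32).tocsr()   # symmetrize, binarize
features = rng.standard_normal((1000, 32)).astype(np.float32)

adj_r, feats_r, perm = reorder_graph(adj, features)
# adj_r / feats_r would then be loaded into a GNN system such as PyTorch
# Geometric or DGL (e.g., by applying the same permutation to edge_index).
```

Because the permutation is computed once but the reordered layout is reused in every epoch's aggregation, the reordering cost can be amortized over training, which is the trade-off the paper quantifies.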
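The abstract also notes that reordering metrics matter when selecting a strategy. As a hypothetical example of such a metric (an assumption on our part, not necessarily one the paper uses), the sketch below scores a layout by the mean absolute ID gap across edges; lower values mean neighboring vertices sit closer together in memory.

```python
# Minimal sketch of a locality metric for comparing vertex orderings.
import numpy as np
import scipy.sparse as sp

def mean_neighbor_gap(adj: sp.csr_matrix) -> float:
    """Mean |u - v| over all edges (u, v): a rough locality score."""
    rows, cols = adj.nonzero()
    return float(np.mean(np.abs(rows - cols)))

# Tiny demo: a path graph in order vs. randomly relabelled.
n = 1000
path = np.arange(n - 1)
good = sp.csr_matrix((np.ones(n - 1), (path, path + 1)), shape=(n, n))
shuffle = np.random.default_rng(1).permutation(n)
bad = good[shuffle, :][:, shuffle]
print(mean_neighbor_gap(good))  # 1.0: every neighbor is adjacent in memory
print(mean_neighbor_gap(bad))   # far larger: poor locality after shuffling
```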