TANGO: Re-thinking Quantization for Graph Neural Network Training on GPUs
Shiyang Chen, Da Zheng, Caiwen Ding, Chengying Huan, Yuede Ji, Hang Liu
Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, 2023. DOI: 10.48550/arXiv.2308.00890
Abstract
Graph learning is becoming increasingly popular due to its superior performance in tackling many grand challenges. While quantization is widely used to accelerate Graph Neural Network (GNN) computation, quantized training faces remarkable roadblocks. Current quantized GNN training systems often experience longer training times than their full-precision counterparts for two reasons: (i) addressing the quantization accuracy challenge leads to excessive overhead, and (ii) the optimization potential exposed by quantization is not adequately leveraged. This paper introduces Tango, which re-thinks quantization challenges and opportunities for graph neural network training on GPUs with three contributions. First, we introduce efficient rules to maintain accuracy during quantized GNN training. Second, we design and implement quantization-aware primitives and inter-primitive optimizations to speed up GNN training. Finally, we integrate Tango with the popular Deep Graph Library (DGL) system and demonstrate its superior performance over state-of-the-art approaches on various GNN models and datasets.
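To make the idea of quantized GNN training concrete, the sketch below shows a generic quantize-then-aggregate message-passing step in PyTorch: node features are compressed to int8 before being gathered along edges, then dequantized and mean-aggregated per destination node. This is only an illustrative assumption of how quantization can enter the aggregation path; the helper names (quantize_int8, quantized_mean_aggregate) are hypothetical and do not correspond to Tango's actual quantization-aware primitives or accuracy-maintenance rules.

```python
import torch

def quantize_int8(x: torch.Tensor):
    # Symmetric per-tensor int8 quantization: scale maps the max magnitude to 127.
    scale = x.abs().max().clamp(min=1e-8) / 127.0
    q = torch.clamp(torch.round(x / scale), -127, 127).to(torch.int8)
    return q, scale

def dequantize(q: torch.Tensor, scale: torch.Tensor) -> torch.Tensor:
    # Recover an approximate full-precision tensor from the int8 representation.
    return q.to(torch.float32) * scale

def quantized_mean_aggregate(feat: torch.Tensor, src: torch.Tensor,
                             dst: torch.Tensor, num_nodes: int) -> torch.Tensor:
    # Toy mean aggregation over a COO edge list (src -> dst) with quantized messages.
    q, scale = quantize_int8(feat)                # compress node features once
    msgs = dequantize(q[src], scale)              # gather and dequantize neighbor features
    out = torch.zeros(num_nodes, feat.size(1))
    out.index_add_(0, dst, msgs)                  # sum messages per destination node
    deg = torch.zeros(num_nodes).index_add_(
        0, dst, torch.ones_like(dst, dtype=torch.float32))
    return out / deg.clamp(min=1).unsqueeze(1)    # mean over in-neighbors

if __name__ == "__main__":
    feat = torch.randn(4, 8)
    src = torch.tensor([0, 1, 2, 3])
    dst = torch.tensor([1, 2, 3, 0])
    print(quantized_mean_aggregate(feat, src, dst, num_nodes=4).shape)  # torch.Size([4, 8])
```

The sketch highlights the trade-off the paper targets: quantization shrinks the data moved during aggregation, but the extra quantize/dequantize work and the accuracy loss it can introduce must be managed for the end-to-end training time to actually improve.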